A Field Guide to Lies: Critical Thinking in the Information Age
Authors: Daniel J. Levitin
Alternative explanations are often critical to legal arguments in criminal trials. The framing effects we saw in Part One, and the failure to understand that conditional probabilities don’t work backward, have led to many false convictions.
Proper scientific reasoning entails setting up two (or more) hypotheses and presenting the probabilities for both. In a courtroom, attorneys shouldn’t be focusing on the probability of a match, but the probability of two possible scenarios: What is the probability that the blood samples came from the same source, versus the probability that they did not? More to the point, we need to compare the probability of a match given that the subject is guilty with the probability of a match given that the subject is innocent. Or we could compare the probability that the subject is innocent given the data, versus the probability that the subject is guilty given the data. We also need to know the accuracy of the measures.
The FBI announced in 2015 that microscopic hair analyses were incorrect 90 percent of the time.
Without these pieces of information, it is impossible to decide the case fairly or accurately. That is, if we talk only in terms of a match, we’re considering only one-sided evidence, the probability of a match given the hypothesis that the criminal was at the scene of the crime. What we don’t know is the probability of a match given alternative hypotheses. And the two need to be compared.
This comes up all the time.
In one case in the U.K., the suspect, Dennis Adams, was accused based solely on DNA evidence. The victim failed to pick him out of a lineup, and in court said that Adams did not look like her assailant. The victim added that Adams appeared two decades older than the assailant. In addition, Adams had an alibi for the night in question, which was corroborated by testimony from a third party. The only evidence the prosecution presented at trial was the DNA match. Now, Adams had a brother, whom the DNA would also have matched, but there was no additional evidence that the brother had committed the crime, and so investigators didn’t consider the brother. But they also lacked additional evidence against Dennis: the only evidence they had was the DNA match. No one in the trial considered the alternative hypothesis that it might have been Dennis’s brother. . . . Dennis was convicted both in the original trial and on appeal.
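The comparison of hypotheses described above can be made concrete with Bayes’ rule. The sketch below is only an illustration: the function names and every number in it (the prior and the chance of a coincidental match) are hypothetical assumptions, not figures from the Adams case or from any real trial.

```python
def likelihood_ratio(p_match_given_guilty, p_match_given_innocent):
    """How many times more likely a match is if the suspect is guilty than if innocent."""
    return p_match_given_guilty / p_match_given_innocent


def posterior_guilt(prior_guilt, p_match_given_guilty, p_match_given_innocent):
    """P(guilty | match), computed with Bayes' rule from the two competing hypotheses."""
    guilty_and_match = p_match_given_guilty * prior_guilt
    innocent_and_match = p_match_given_innocent * (1.0 - prior_guilt)
    return guilty_and_match / (guilty_and_match + innocent_and_match)


# Hypothetical inputs, chosen only for illustration.
prior = 1 / 200_000            # assumed: suspect is one of 200,000 people who could have done it
p_match_guilty = 1.0           # assumed: the test always matches the true source
p_match_innocent = 1 / 1_000_000  # assumed: chance of a coincidental match

print("likelihood ratio:", likelihood_ratio(p_match_guilty, p_match_innocent))
print("P(guilty | match):", posterior_guilt(prior, p_match_guilty, p_match_innocent))

# A close relative could raise the chance of a match under the innocent
# hypothesis; the value below is again a made-up number.
print("with a likelier innocent match:",
      posterior_guilt(prior, p_match_guilty, 1 / 10_000))
```

The point of the sketch is that a match probability on its own is not the answer. The answer depends on how likely the same match would be under the alternative hypothesis, and on how plausible guilt was before the match was considered.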
Built by the Ancients to Be Seen from Space
You may have heard the speculation that human life didn’t really evolve on Earth, that a race of space aliens came down and seeded the first human life. This by itself is not implausible, it’s just that there is no real evidence supporting it. That doesn’t mean it’s not true, and it doesn’t mean we shouldn’t look for evidence, but the fact that something could be true has limited utility, except perhaps for science fiction.
A 2015 story in the New York Times described a mysterious formation on the ground in Kazakhstan that could be seen only from space.
Satellite pictures of a remote and treeless northern steppe reveal colossal earthworks—geometric figures of squares, crosses, lines and rings the size of several football fields, recognizable only from the air and the oldest estimated at 8,000 years old.
The largest, near a Neolithic settlement, is a giant square of 101 raised mounds, its opposite corners connected by a diagonal cross, covering more terrain than the Great Pyramid of Cheops. Another is a kind of three-limbed swastika, its arms ending in zigzags bent counterclockwise.
It’s easy to get carried away and imagine that these great designs were a way for ancient humans to signal space aliens, perhaps following strict extraterrestrial instructions. Perhaps it was an ancient spaceship landing pad, or a coded message, something like “Send more food.” We humans are built that way—we like to imagine things that are out of the ordinary. We are the storytelling species.
Setting aside the rather obvious fact that any civilization capable of interstellar flight must have had a more efficient communication technology at their disposal than arranging large mounds of dirt on the ground, an alternative explanation exists. Fortunately, the New York Times (although not every other outlet that reported the story) provides it, in a quote from Dimitriy Dey, the discoverer of the mysterious stones:
“I don’t think they were meant to be seen from the air,” Mr. Dey, 44, said in an interview from his hometown, Kostanay, dismissing outlandish speculations involving aliens and Nazis. (Long before Hitler, the swastika was an ancient and near-universal design element.) He theorizes that the figures built along straight lines on elevations were “horizontal observatories to track the movements of the rising sun.”
An ancient sundial explanation seems more likely than space aliens. It doesn’t mean it’s true, but part of information literacy and evaluating claims is uncovering plausible alternatives, such as this.
The Missing Control Group
The so-called Mozart effect was discredited because the experiments, showing that listening to Mozart for twenty minutes a day temporarily increased IQ, lacked an adequate control group. That is, one group of people was given Mozart to listen to, and one group of people was given nothing to do. Doing nothing is not an adequate control for doing something, and it turns out that if you give people something to do (almost anything), the effect disappears. The Mozart effect wasn’t driven by Mozart’s music increasing IQ; it was driven by the boredom of doing nothing temporarily decreasing effective IQ.
If you bring twenty people with headaches into a laboratory and give them your new miracle headache drug and ten of them get better, you haven’t learned anything. Some headaches are going to get better on their own. How many? We don’t know. You’d need to have a control group of people with similar ages and backgrounds, and reporting similar pain. And because just the belief that you might get better can lead to health improvements, you have to give the control group something that enables that belief as much as the medicine under study. Hence the well-known placebo, a pill that is made to look exactly like the miracle headache drug so that no one knows who is receiving what until after the experiment is over.
Malcolm Gladwell spread an invalid conclusion in his book David and Goliath by suggesting that people with dyslexia might actually have an advantage in life, leading many parents to believe that their dyslexic children should not receive the educational remedies they need. Gladwell fell for the missing control condition. We don’t know how much more successful his chosen dyslexics might have been if they had been able to improve their condition.
The missing control group shows up in everyday conversation, where it’s harder to spot than in scientific claims, simply because we’re not looking for it there. You read (and validate) a new study showing that going to bed every night and waking up every morning at the same time increases productivity and creativity. An artist friend of yours, successful by any measure, counters that she’s always just slept whenever she wanted, frequently pulling all-nighters and sometimes sleeping for twenty hours at a time, and she’s done just fine. But there’s a missing control group. How much more productive and creative might she have been with a regular sleep schedule? We don’t know.
Twin brothers were separated at birth and reared apart, one in Nazi Germany and the other in Trinidad and Venezuela. One was raised as a Roman Catholic who joined the Hitler Youth, the other as a Jew.
They were reunited twenty-one years later and discovered a bizarre list of similar behaviors that many fascinated people could only attribute to genetics: Both twins scratched their heads with their ring finger, both thought it was funny to sneak up on strangers and sneeze loudly. Both men wore short, neatly trimmed mustaches and rectangular wire-rimmed glasses, rounded at the corner. Both wore blue shirts with epaulets and military-style pockets. Both had the same gait when walking, and the same way of sitting in chairs. Both loved butter and spicy food, flushed the toilet before and after using it, and read the endings of books first. Both wrapped tape around pens and pencils to get a better grip.
Stories like this may cause you to wonder about how our behaviors are influenced by our genes. Or if we’re all just automatons, and our actions are predetermined. How else to explain such coincidences?
Well, there are two ways, and they both boil down to a missing control group. A social psychologist might say that the world tends to treat people who look alike in similar ways. The attractive are treated differently from the unattractive, the tall differently from the short. If there’s something about your face that just looks honest and free of self-interest, people will treat you differently from how they would if your face suggests otherwise. The brothers’ behaviors were shaped by the social world in which they live. We’d need a control group of people who are not related, but who still look astonishingly alike, and were raised separately, in order to draw any firm conclusions about this “natural experiment” of the twins separated at birth.
A statistician or behavioral geneticist would say that of the thousands upon thousands of things that we do, it is likely that any two strangers will share some striking similarities in dress, grooming, penchant for practical jokes, or odd proclivities if you just look long enough and hard enough. Without this control group—bringing strangers together and taking an inventory of their habits—we don’t know whether the fascinating story about the twins is driven by genetics or pure chance. It may be that genetics plays a role here, but probably not as large a role as we might think.
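The statistician’s point can be sketched with a binomial calculation. Everything here is an assumption made for illustration: that an inventory covers 1,000 habits, that any single habit matches between two strangers with probability 1 percent, and that habits are independent of one another (real habits are not). The function name is likewise hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more coincidental matches."""
    prob_fewer_than_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - prob_fewer_than_k


# Hypothetical inventory of habits compared between two unrelated strangers.
n_habits = 1_000        # assumed number of habits compared
p_coincidence = 0.01    # assumed chance that any one habit matches by luck

print("expected coincidental matches:", n_habits * p_coincidence)
print("P(at least 5 matches):", prob_at_least(5, n_habits, p_coincidence))
```

Under these made-up inputs, a handful of striking matches between strangers is the expected outcome, not a surprise, which is why the inventory of strangers’ habits is the control group the twin story lacks.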
Cherry-picking
Our brains are built to make stories as they take in the vastness of the world with billions of events happening every second. There are apt to be some coincidences that don’t really mean anything. If a long-lost friend calls just as you’re thinking of her, that doesn’t mean either of you has psychic powers. If you win at roulette three times in a row, that doesn’t mean you’re on a streak and should bet your last dollar on the next spin. If your non-certified mechanic fixes your car this time, it doesn’t mean he’ll be able to do it next time—he may just have gotten lucky.
Say you have a pet hypothesis, for example, that too much Vitamin D causes malaise; you may well find evidence to support that view. But if you’re looking only for supporting evidence, you’re not doing proper research, because you’re ignoring the contradictory evidence—there might be a little of this or a lot, but you don’t know because you haven’t looked. Colloquially, scientists call this “cherry-picking” the data that suit your hypothesis. Proper research demands that you keep an open mind about any issue, and try to valiantly consider the evidence for and against, and then form an evidence-based (not a “gee, I wish this were so”–based) conclusion.
A companion to the cherry-picking bias is selective windowing. This occurs when the information you have access to is unrepresentative of the whole. If you’re looking at a city through the window of a train, you’re only seeing a part of that city, and not necessarily a representative part: you have visual access only to the part of the city with train tracks running through it, and whatever biases may attach to that. Trains make noise. Wealthier people usually occupy houses away from the noise, so the people who are left living near the tracks tend to have lower income. If all you know of a city is who lives near the tracks, you are not seeing the entire city.
This is of course related to the discussion in Part One about data gathering (how data are collected), and the importance of obtaining representative samples. We’re trying to understand the nature of the world—or at least a new city that the train’s passing through—and we want to consider alternative explanations for what we’re seeing or being told. A good alternative explanation with broad applicability is that you’re only seeing part of the whole picture, and the part you’re not seeing may be very different.
Maybe your sister is proudly displaying her five-year-old daughter’s painting. It may be magnificent! If you love the painting, frame it! But if you’re trying to figure out whether to invest in the child’s future as the world’s next great painter, you’ll want to ask some questions: Who cropped it? Who selected it? How big was the original? How many drawings did the little Picasso make before this one? What came before and what came after? Through selective windowing, you may be seeing part of a series of brilliant drawings or a lovely little piece of a much larger (and unimpressive) work that was identified and cropped by the teacher.
We see selective windowing in headlines too. A headline might announce that “three times more Americans support this new legislation than oppose it.” Even if you satisfy yourself, based on the steps in Part One of the Field Guide, that the survey was conducted on a representative and sufficiently large sample of Americans, you can’t conclude that the majority of Americans support the legislation. It could well be that 1 percent oppose it, 3 percent support it, and 96 percent remain undecided. Translate this same kind of monkeyshines to an election headline stating that five times as many Republicans support Candidate A as Candidate B for the presidential primaries. That may be true, but the headline might leave out that Candidate C is polling with 80 percent of the vote.
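The arithmetic behind such a headline is easy to check. This is a minimal sketch with a hypothetical helper, `summarize_poll`, using the example of 3 percent in favor and 1 percent opposed, with everyone else counted as undecided.

```python
def summarize_poll(support_pct, oppose_pct):
    """Return the headline ratio, the undecided share, and whether support is a majority."""
    undecided_pct = 100.0 - support_pct - oppose_pct
    ratio = support_pct / oppose_pct
    has_majority = support_pct > 50.0
    return ratio, undecided_pct, has_majority


ratio, undecided, majority = summarize_poll(support_pct=3.0, oppose_pct=1.0)
print(f"supporters per opponent: {ratio}")
print(f"undecided share: {undecided} percent")
print(f"majority support: {majority}")
```

The ratio in the headline is true, yet it says nothing about the share of the whole population, which is the number a reader actually needs.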
Try tossing a coin ten times. You “know” that it should come up heads half the time. But it probably won’t. Even if you toss it 1,000 times, you probably won’t get exactly 500 heads. Theoretical probabilities are achieved only with an infinite number of trials. The more coin tosses, the closer you’ll get to fifty-fifty heads/tails. It’s counterintuitive, but there’s a probability very close to 100 percent that somewhere in a sequence of 1,000 tosses you’ll get five heads in a row. Why is this so counterintuitive? We didn’t evolve brains with a sufficient understanding of what randomness looks like. It’s not usually heads-tails-heads-tails, but there are going to be runs (also called streaks) even in a random sequence. This makes it easy to fool someone. Just make a cell phone video recording of yourself tossing a coin 1,000 times in a row. Before each toss, say, “This is going to be the first of five heads in a row.” Then, if you get a head, before the next toss, say, “This is going to be the second of five heads in a row.” If the next one is a tail, start over. If it’s not, before you make the next toss, say, “This is going to be the third of five heads in a row.” Then just edit your video so that it only includes those five in a row. No one will be any the wiser! If you want to really impress people, go for ten in a row! (There’s roughly a 38 percent chance of that happening in 1,000 tosses. Looking at this another way, if you ask a hundred people in a room to toss a coin five times, there is a 96 percent chance that at least one of them will get five heads in a row.)
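The streak probabilities quoted above can be checked directly. The sketch below assumes a fair coin and independent tosses, and uses hypothetical function names. It tracks the probability of every “no streak yet” state toss by toss, so it computes the chance of a run exactly instead of estimating it by simulation.

```python
def prob_run_of_heads(n_tosses, run_length, p_heads=0.5):
    """Probability of at least one run of `run_length` consecutive heads in `n_tosses`."""
    # state[j] = probability that no full run has occurred yet and the
    # current streak of heads has length j (0 <= j < run_length).
    state = [0.0] * run_length
    state[0] = 1.0
    for _ in range(n_tosses):
        new_state = [0.0] * run_length
        for streak, prob in enumerate(state):
            new_state[0] += prob * (1.0 - p_heads)       # a tail resets the streak
            if streak + 1 < run_length:
                new_state[streak + 1] += prob * p_heads  # a head extends the streak
            # Otherwise the head completes the run, and that probability
            # leaves the "no run yet" states for good.
        state = new_state
    return 1.0 - sum(state)


def prob_someone_gets_all_heads(n_people, n_tosses, p_heads=0.5):
    """Probability that at least one of `n_people` tosses `n_tosses` heads in a row."""
    p_single = p_heads ** n_tosses
    return 1.0 - (1.0 - p_single) ** n_people


print("five in a row within 1,000 tosses:", prob_run_of_heads(1_000, 5))
print("ten in a row within 1,000 tosses:", prob_run_of_heads(1_000, 10))
print("100 people, five tosses each:", prob_someone_gets_all_heads(100, 5))
```

Run as written, this should reproduce the three figures in the text to within rounding: near certainty for five in a row, about 38 percent for ten in a row, and about 96 percent for the room of a hundred people.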