Peirce may have been our greatest thinker, but his line in this context almost sounds scary. Nothing could be more antithetical to intellectual reform than an appeal against thoughtful scrutiny of our most hidebound mental habits—notions so "obviously" true that we stopped thinking about them generations ago, and moved them into our hearts and bosoms. Please do not forget that the sun really does rise in the east, move through the sky each day, and set in the west. What knowledge could be more visceral than the earth’s central stability and the sun’s subordinate motion?
Darwin was born on the same day as Lincoln, and "officially" inaugurated the revolution that bears his name when he published the Origin of Species in 1859. During the centennial celebrations in 1959, the great American geneticist H. J. Muller dampened festivities with an address titled "One Hundred Years Without Darwin Are Enough." Muller treated the revolution’s failure to penetrate at two opposite ends of a spectrum—creationism’s continuing hold over much of American pop culture, and limited understanding of natural selection among well-educated people content with the factuality of evolution.
But I think that something even larger, and standing in the middle of this spectrum, has always ranked as the greatest impediment to completing the Darwinian revolution. Freud was right in identifying suppression of human arrogance as the common achievement of great scientific revolutions. Darwin’s revolution—the acceptance of evolution with all major implications, the second blow in Freud’s own series—has never been completed. In Freud’s terms, the revolution will not be fulfilled when Mr. Gallup can find no more than a handful of deniers, or when most Americans can give an accurate epitome of natural selection. Darwin’s revolution will be completed when we smash the pedestal of arrogance and own the plain implications of evolution for life’s nonpredictable nondirectionality—and when we take Darwinian topology seriously, recognizing that Homo sapiens, to recite the revised litany one more time, is a tiny twig, born just yesterday on an enormously arborescent tree of life that would never produce the same set of branches if regrown from seed. We grasp at the straw of progress (a desiccated ideological twig) because we are still not ready for the Darwinian revolution. We crave progress as our best hope for retaining human arrogance in an evolutionary world. Only in these terms can I understand why such a poorly formulated and improbable argument maintains such a powerful hold over us today.
3
Different Parsings, Different Images of Trends
Fallacies in the Reading and Identification of Trends
The more important the subject and the closer it cuts to the bone of our hopes and needs, the more we are likely to err in establishing a framework for analysis. We are story-telling creatures, products of history ourselves. We are fascinated by trends, in part because they tell stories by the basic device of imparting directionality to time, in part because they so often supply a moral dimension to a sequence of events: a cause to bewail as something goes to pot, or to highlight as a rare beacon of hope.
But our strong desire to identify trends often leads us to detect a directionality that doesn’t exist, or to infer causes that cannot be sustained. The subject of trends has inspired and illustrated some of the classic fallacies in human reasoning. Most prominently, since people seem to be so bad at thinking about probability and so prone to read pattern into sequences of events, we often commit the fallacy of spotting a "sure" trend and speculating about causes, when we observe no more than a random string of happenings.
In the classic case, most people have little sense of how often an apparent pattern will emerge in purely random data. Take the standard illustration of coin flipping: we compute the probability of sequences by multiplying the chances of individual events. Since the probability for heads is always 1/2, the chance of flipping five heads in a row is 1/2 × 1/2 × 1/2 × 1/2 × 1/2, or one in thirty-two—rare to be sure, but something that will happen every once in a while for no reason but randomness. Many people, however, particularly if they are betting on tails, will read five heads in a row as prima facie evidence of cheating. People have been shot and killed for less—in life as well as in Western movies.
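The multiplication rule for a run of independent events can be checked with a minimal sketch. The function name `prob_run` is my own label for illustration; exact fractions are used so that the one-in-thirty-two figure appears without rounding.

```python
from fractions import Fraction


def prob_run(p_single: Fraction, length: int) -> Fraction:
    """Probability that `length` independent events, each with chance
    `p_single`, all succeed: the single-event chance multiplied by itself."""
    return p_single ** length


# Five heads in a row with a fair coin: 1/2 x 1/2 x 1/2 x 1/2 x 1/2
five_heads = prob_run(Fraction(1, 2), 5)
print(five_heads)          # 1/32
print(float(five_heads))   # 0.03125
```

A chance of about 3 percent is small, but across many sequences of five tosses it guarantees that such "streaks" will turn up now and then with no cause beyond randomness.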
In my favorite, more subtle example of the same error, T. Gilovich, R. Vallone, and A. Tversky debunked a phenomenon that every basketball fan and player absolutely "knows" to be true—"hot hands," or streaks of successive baskets, magic minutes of "getting into the groove" or "finding the range," when every shot hits. The phenomenon sounds so obvious: when you’re hot you’re hot, and when you’re not you’re not. But "hot hands" does not exist. My colleagues studied every basket made by the Philadelphia 76ers for more than a season. They made two debunking discoveries: first, the probability of making a second basket did not rise following a successful shot; second, and more important, the number of "runs," or successful baskets in sequence, did not exceed the predictions of a standard random, or coin-tossing, model. Remember that, on average, you will flip five heads in a row once in every thirty-two sequences of five tosses. We can, by analogy, compute expected runs for any basketball player. Suppose that Mr. Swish, a particularly good shooter, succeeds in 60 percent of his field-goal attempts. He should then notch six baskets in a row once every 20 sequences or so (0.6 × 0.6 × 0.6 × 0.6 × 0.6 × 0.6, for 0.047, or 4.7 percent). If Swish’s actual play includes sequences of six at about this rate, then we have no evidence for hot hands, but only for Swish playing in his characteristic manner for each shot independently. Gilovich, Vallone, and Tversky found no sequences beyond the range of random expectations.
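Mr. Swish's arithmetic can be reproduced, and extended a little, under the same coin-tossing assumption that every shot is independent. The sketch below is not the method used by Gilovich, Vallone, and Tversky; the function `streak_probability` and the choice of twenty attempts are hypothetical, introduced only to show how readily a run of six appears somewhere within a longer string of shots.

```python
def streak_probability(p: float, run: int, shots: int) -> float:
    """Chance of at least one run of `run` straight makes somewhere in
    `shots` independent attempts, each made with probability `p`."""
    # state[k] = probability of sitting on a current streak of k makes
    # while no full run of length `run` has yet occurred.
    state = [1.0] + [0.0] * (run - 1)
    for _ in range(shots):
        nxt = [0.0] * run
        for k, mass in enumerate(state):
            nxt[0] += mass * (1 - p)       # a miss resets the streak to zero
            if k + 1 < run:
                nxt[k + 1] += mass * p     # a make extends the streak
            # otherwise the streak reaches `run`, and that probability
            # leaves the "no run yet" states for good
        state = nxt
    return 1.0 - sum(state)


p_six = 0.6 ** 6
print(round(p_six, 3))        # 0.047
print(round(1 / p_six, 1))    # 21.4, i.e. once every twenty sequences or so

# Hypothetical: chance of at least one six-basket run within twenty attempts
print(streak_probability(0.6, 6, 20))
```

With exactly six attempts the function returns the same 0.6 × 0.6 × 0.6 × 0.6 × 0.6 × 0.6 as the text; with more attempts the chance of seeing a run somewhere can only grow, still without any hot hand.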
My colleague Ed Purcell, a Nobel Prize winner in physics but just a keen baseball fan in this context, then did a similar study of baseball streaks and slumps, and we published the results together (Gould, 1988). Purcell found that among all runs, the subject of so much mythology about heroes (and goats), only one record stands beyond reasonable probability, and should not have happened at all—Joe DiMaggio’s fifty-six-game hitting streak in 1941—thus validating the feeling of many fans that DiMaggio’s splendid run is the greatest achievement in modern sports (and exonerating all the poor schlumps whose runs of failure lie entirely within the expectations of their characteristic probabilities!).
As one final example, probably more intellectual energy has been invested in discovering (and exploiting) trends in the stock market than in any other subject—for the obvious reason that stakes are so high, as measured in the currency of our culture. The fact that no one has ever come close to finding a consistent way to beat the system—despite intense efforts by some of the smartest people in the world—probably indicates that such causal trends do not exist, and that the sequences are effectively random.
In the second most prominent fallacy about trends, people correctly identify a genuine directionality, but then fall into the error of assuming that something else moving in the same direction at the same time must be acting as the cause. This error, the conflation of correlation with causality, arises for the obvious reason (once you think about it) that, at any moment, oodles of things must be moving in the same direction (Halley’s comet is receding from earth and my cat is getting more ornery)—and the vast majority of these correlated sequences cannot be causally related. In the classic illustration, a famous statistician once showed a precise correlation between arrests for public drunkenness and the number of Baptist preachers in nineteenth-century America. The correlation is real and intense, but we may assume that the two increases are causally unrelated, and that both arise as consequences of a single different factor: a marked general increase in the American population.
The error detailed in this book has not often been named or identified, but may be just as prominent in our fallacious thinking about trends. I shall focus on two central examples from two dramatically different cultural realms: "Why does no one hit 0.400 anymore in baseball?" and "How does progress characterize the history of life?" These are classic trends, in the sense that each encapsulates the essence and history of an important institution, and both have moral implications—one, in baseball, apparently trying to tell us that something about modern life causes excellence, or old-fashioned virtue, to degenerate; the other, for life, providing our necessary solace and excuse for continuing to view ourselves as lords of all.
I shall not use the juxtaposition of these examples to present pap and nonsense about how life imitates baseball, or vice versa. But I will show that the same error has led us to view both trends the wrong way round. Straighten out the fallacy, and you will see that the disappearance of 0.400 hitting illustrates the increasing excellence of play in baseball (however paradoxical such a claim may sound at first)—while life, on the other hand, shows no general thrust to improvement, but just adds an occasional exemplar of complexity in the only region of available anatomical space, while maintaining, for more than 3 billion years, an unvarying bacterial mode. Baseball has improved, but life has always been, and will probably always remain until the sun explodes, in the Age of Bacteria.
The common error lies in failing to recognize that apparent trends can be generated as by-products, or side consequences, of expansions and contractions in the amount of variation within a system, and not by anything directly moving anywhere. Average values may, in fact, stay constant within the system (as average batting percentages have done in major-league baseball, and as the bacterial mode has remained for life)—while our (mis)perception of a trend may represent only our myopic focus on rare objects at one extreme in a system’s variation (as this periphery expands or contracts). And the reasons for expansion or contraction of a periphery may be very different from causes for a change in average values. Thus, if we mistake the growth or shrinkage of an edge for movement of an entire mass, we may devise a backwards explanation. I will show that the disappearance of 0.400 hitting marks the shrinkage of such an edge caused by increasing excellence in play, not the extinction of a cherished entity (which would surely signify degeneration of something, and a loss of excellence).
Let me illustrate this unfamiliar concept with a simple (and silly) example to show how, in two cases, an apparent trend may arise only by expansion or contraction of variation. In both cases we tend to misinterpret a phenomenon because we maintain such strong preferences for viewing trends as entities moving somewhere.
The one hundred inhabitants of a mythical land subsist on an identical diet and all weigh one hundred pounds. In my first case, an argument about nutrition develops, with some folks pushing a new (and particularly calorific) brand of cake, and others advocating increased abstemiousness. Most members of the population don’t give a damn and stay where they are, but ten folks eat copious amounts of cake and now average 150 pounds, while ten others run and starve to reach an average weight of fifty pounds. The mean of the population hasn’t altered at all, remaining right at its old value of one hundred pounds—but variation in weight has expanded markedly (and symmetrically in both directions).
Cake-makers, pushing the aesthetic beauty of the new and fuller look, might celebrate a trend to greater weight by focusing on the small subset of people under their influence, and ignoring the others—just as the running-and-dieting moralists might exalt twigginess and praise a supposed trend in this direction by isolating their own small subset. But no general trend has occurred at all, at least in the usual sense. The average of the population has not altered by a single pound, and most people (80 percent in this case) have not varied their weight by an ounce. The only change has been a symmetrical expansion of variation on both sides of a constant mean weight. (You may recognize this increased spread as significant, of course, but we usually don’t describe such nondirectional changes as "trends.")
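The first case can be tallied in a few lines. This is only a sketch of the mythical population described above; for simplicity I assume that each of the ten cake-eaters weighs exactly 150 pounds and each of the ten runner-dieters exactly fifty, whereas the text gives these figures as group averages. The population standard deviation serves as one convenient measure of spread.

```python
from statistics import mean, pstdev

# One hundred inhabitants, all at one hundred pounds
before = [100] * 100

# Ten cake-eaters, ten runner-dieters, eighty who stay where they are
after = [150] * 10 + [50] * 10 + [100] * 80

print(mean(before), mean(after))       # the mean is unchanged
print(pstdev(before), pstdev(after))   # the spread is not

# Share of the population whose weight did not change at all
unchanged = sum(1 for w in after if w == 100) / len(after)
print(unchanged)                       # 0.8
```

The mean stays at one hundred pounds while the spread grows from zero to about twenty-two pounds: a change in variation, not a trend.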
You may choose to regard this example as both silly and transparent. Few of us would have any trouble identifying the actual changes, and we would laugh the shills of both cake-makers and runner-dieters out of town, if they tried to pass off the changes in their small subset as a general trend. But bear with me, for I shall show that many phenomena often perceived as trends, and either celebrated or lamented with gusto and acres of printer’s ink—the disappearance of 0.400 hitting among them—also represent symmetrical changes of variation around constant mean values, and therefore display the same fallacy, though better hidden.
My second case features a totalitarian society ruled by the runner-dieters. They have been pushing their line for so long that everyone has succumbed to social pressure and weighs fifty pounds. A more liberal regime takes over and permits free discussion about ideal weights. Fine, but for one catch imposed by physiology rather than politics: fifty pounds is the lower limit for sustaining life, and no one can get any thinner. Therefore, although citizens are now free to alter their weight, only one direction of change is possible. The great majority of inhabitants remain content with the old ways and elect to maintain themselves at fifty pounds. Fifteen percent of the population revels in its newfound freedom and begins to gain weight with abandon. Six months later, these fifteen individuals average seventy-five pounds; after a year, one hundred pounds; and after two years, 150 pounds.
The statistical spin doctors for the fat fifteen now step in. They argue that their clients’ point of view is sweeping through the whole society, as clearly indicated by the steady increase of mean weight for the entire population. And who can deny their evidence? They even present a fancy graph (shown here as Figure 3). Before the liberation, average weight stood at fifty pounds; after six months the mean rises to 53.8 pounds (the average for eighty-five remaining at fifty pounds, and fifteen rising to seventy-five pounds); after a year to 57.5 pounds; and after two years to sixty-five pounds (an increase of 30 percent from the original fifty)—a steady, unreversed, and substantial rise.
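The spin doctors' figures follow from simple arithmetic, sketched below. The function name `population_mean` and its default arguments are my own, taken from the numbers in the story: fifteen gainers in a population of one hundred, with everyone else at the fifty-pound floor.

```python
def population_mean(heavy_weight: float, n_heavy: int = 15,
                    n_total: int = 100, floor: float = 50) -> float:
    """Mean weight when `n_heavy` people average `heavy_weight` pounds
    and everyone else remains at the `floor` weight."""
    return (n_heavy * heavy_weight + (n_total - n_heavy) * floor) / n_total


timeline = [("before", 50), ("six months", 75),
            ("one year", 100), ("two years", 150)]
for label, weight_of_gainers in timeline:
    print(label, population_mean(weight_of_gainers))

# Rise in the mean over two years, as a fraction of the original fifty pounds
rise = (population_mean(150) - population_mean(50)) / population_mean(50)
print(rise)
```

The means come out as 50, 53.75 (the 53.8 quoted above, rounded), 57.5, and 65 pounds, even though eighty-five of the one hundred weights never move.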
Again, you may view this example as silly (and purposely chosen to illustrate the obvious nature of the point, once you understand the whole system and its variation). Few people would be fooled, so long as they grasped the totality of the story, and knew that most members of the population had not changed their weight, and that the steady increase in mean values arises as an artifact produced by amalgamating two entirely different subpopulations—a majority of stalwarts with a minority of revolutionaries. But suppose you didn’t appreciate the whole tale, and only listened to the statistical spin doctors for the fat fifteen. Suppose, in addition, that you tended to imbue mean values (as I fear most of us do) with a reality transcending actual individuals and the variation among them. You might then be persuaded from Figure 3 that a general trend has swept through the population, thrusting it as a whole toward greater average weights.
FIGURE 3 Average weight of my hypothetical population plotted against time to show how a false impression of an overall trend may be generated.
We are more likely to be fooled by the second case, where limits to variation on one side of the average permit change in only one direction. The rise of mean values isn’t "false" in this second case, but the supposed trend is surely misleading in the sense of Mark Twain’s or Disraeli’s famous line (the quote has been attributed to both) about three kinds of falsification—"lies, damned lies, and statistics." I will present the technicalities later, but let me quickly state why such false impressions can emerge from correct data in this case—as so often exploited by economic pundits and political spin doctors. As in the cliché about skinning cats, there is more than one way to represent an "average." The most common method, technically called the mean, instructs us to add up all the values and divide by the number of cases. If ten kids have ten dollars among them, the mean wealth per kid is one dollar. But means can be grossly misleading—and never more so than in the type of example purposely chosen above: when variation can expand markedly in one direction and little or not at all in the other. For means will then drift toward the open end and give an impression (often quite false) that the whole population has moved in that direction.
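The ten-kids example shows how little a mean reveals about the individuals behind it. In the sketch below both distributions of the ten dollars are hypothetical, and the median (the middle value when cases are ranked) is my own choice of a contrasting "average"; the passage above has not yet named an alternative.

```python
from statistics import mean, median

# Two hypothetical ways for ten kids to hold ten dollars among them
even_split = [1] * 10            # every kid has one dollar
one_rich_kid = [10] + [0] * 9    # one kid holds everything

for name, dollars in [("even split", even_split),
                      ("one rich kid", one_rich_kid)]:
    print(name, "mean:", mean(dollars), "median:", median(dollars))
```

Both cases give a mean of one dollar per kid, but in the second the typical kid has nothing: the mean has drifted toward the one open-ended value.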