The Sabermetric Revolution: Assessing the Growth of Analytics in Baseball
Benjamin Baumer and Andrew Zimbalist
One of the more quickly assimilated pearls of sabermetric wisdom was the avoidance of the sacrifice bunt. Many sabermetricians have written about the technique, generally finding it to be associated with a decrease in run scoring, rather than an increase. The argument usually harkened back to the expected run matrix (see the Appendix), which contains the number of runs an average team can expect to score in the remainder of an inning, given the configuration of the baserunners and the number of outs. Suppose we have a runner on first and nobody out. Then the expected number of runs scored in the remainder of the inning is about 0.87. If we successfully sacrifice bunt, then we have a runner on second and one out. But in this case the expected number of runs scored is only about 0.67, which suggests that our strategic ploy has cost us about one-fifth of a run.[68] In a competitive game, you simply cannot afford to give that away.
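The run-expectancy comparison above can be sketched in a few lines of code. The two values (0.87 and 0.67) come from the text; the table structure and function names here are illustrative, not taken from any published run-expectancy dataset.

```python
# run_expectancy maps (occupied bases, outs) -> expected runs scored
# in the remainder of the inning. Only the two states discussed in
# the text are filled in; a full table has 24 base-out states.
run_expectancy = {
    ("runner on first", 0): 0.87,  # value from the text
    ("runner on second", 1): 0.67, # value from the text
}

def sacrifice_bunt_delta(before, after):
    """Change in expected runs from a successful sacrifice bunt."""
    return run_expectancy[after] - run_expectancy[before]

delta = sacrifice_bunt_delta(("runner on first", 0), ("runner on second", 1))
print(round(delta, 2))  # -0.2: the bunt "costs" about one-fifth of a run
```

The same lookup-and-subtract logic underlies most expected-run arguments: pick the state before the play, pick the state after, and compare.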
Or so went the thinking. By the mid- to late 2000s, sabermetrically oriented teams like Boston, Oakland, and Toronto eschewed the sacrifice bunt as a tool for generating offense, sacrifice bunting fewer than 20 times per season, or less than once per week. Conversely, the notably sabermetrically averse Colorado Rockies sacrificed 119 times in 2006, or nearly once per game.[69]
Indeed, Oakland’s bunting declined from about once every three games during the first half of Sandy Alderson’s tenure, to about once every seven games in the five years following the publication of Moneyball. Interestingly, they have rebounded somewhat (about once every five games) over the past five years. Why might this change have occurred?
It turns out that while the initial observation that sacrifice bunting caused a decline in run expectation was not erroneous, it was foolish to extrapolate that the sacrifice bunt is always a bad play. For starters, the goal of maximizing expected runs is probably wise to pursue at the beginning and middle of the game, but toward the end of a close game, you are more likely to want to maximize the probability of scoring at least one run. Moreover, as the authors of The Book emphasize, the expected run values need to be put in context. The pitcher and batter are both rarely of league-average quality, so the calculation changes with each and every plate appearance.
Moreover, the manager doesn’t get to choose to sacrifice bunt—he only has the option to choose to try to sacrifice bunt. The consequences of failing to sacrifice are varied, and usually even more harmful. And, of course, in the National League the person usually sacrifice bunting is the pitcher, who may be a far worse hitter, but a better bunter, a slower runner, a poorer judge of the strike zone, and so on.
Here again, while the initial sabermetric insight was revolutionary, in that it gave us a systematic way to analyze a strategy that had been in use for decades, it has been superseded years later by a more nuanced understanding of the issue that is much closer to the conventional wisdom. This is not to say that we have learned nothing, and certainly we are far better positioned to make progress in the future, but it is unfortunate that the early adopters were so vociferous that sabermetricians are now thought to believe things that they no longer do.
In 1900, the famed German mathematician David Hilbert published a list of twenty-three unsolved problems in mathematics, which he considered to be of particular importance. His list has been greatly influential to mathematicians, who, as of this writing, have resolved about half of the problems and partially resolved another quarter. Among the remaining unresolved problems is the Riemann hypothesis, which at this point is probably the most famous unproven conjecture in all of mathematics. In 2004, before leaving Baseball Prospectus to become the director of baseball research and analytics for the Cleveland Indians, Keith Woolner published a list of twenty-three unresolved questions in sabermetrics. While Woolner’s list has been less influential than Hilbert’s, it covers a variety of topics that are of universal interest, including, first and foremost, the question of “separating defense into pitching and fielding” discussed above.[70]
Woolner’s Hilbert problems were separated into seven major categories: defense, offense, pitching, developmental strategies, economics, strategic decisions, and tactical decisions. We solicited the opinions of a dozen or so of the most prominent sabermetricians, to revisit Woolner’s problems with the
benefit of ten additional years of research. While there is consensus on many subjects, confusion surrounds a few notable questions. Of course, each respondent has her own standard of proof, but the results hint at a divide between those in the public domain and those working for clubs.
Woolner’s first problem, about separating pitching and defense, has probably attracted the most attention, and its placement on his list suggests that it may have been first on his mind at the outset. But while this question garnered interest among all whom we surveyed, some considered the problem to be still open while others thought it had been satisfactorily resolved. Mitchel Lichtman (in part due to UZR, but also John Dewan’s qualitative identification of good and bad plays) believes that we already understand 90 percent of fielding and that FIELDf/x will give us the rest.[71]
In contrast, Bill James, as we saw above, remains skeptical of this. Our own view is that, for reasons outlined above, there is still a tremendous amount of uncertainty surrounding the defensive evaluation of individual players. And while FIELDf/x will likely address some of the major unknowns in the current methodologies (e.g., the starting position of each fielder), it is hard to imagine that defensive evaluation will not be an active area of research for the foreseeable future.
Of equally general interest was Woolner’s tenth problem: projecting minor league pitchers accurately. Here again, while Lichtman and Tom Tango consider this problem to be exceptionally difficult, one club official considered it “reasonably addressed.” Lichtman hints at a potential reason for this discrepancy: “There is only so much you can do with the numbers. Of course insiders who ‘know’ or have access to people who ‘know’ these pitchers can or at least should do a lot better than an outsider who only has access to the numbers. Greater strides will and can be made in this area if and when front office analysts work more with the on-field personnel.”[72]
Lichtman’s larger point is that sabermetricians working for clubs are likely the only ones with access to critical information about the pitcher’s repertoire, work ethic, mental discipline, health, and so on. The notion that sabermetrics and traditional player development information can be combined into analysis that helps not only the front office, but also the players, represents the current state-of-the-art philosophy. It is unfortunate, but perhaps inevitable, that lines of inquiry that were started by enthusiasts for the edification and
enjoyment of a small community may be nearing their fascinating conclusion under lock and key.
The world that Michael Lewis presented in Moneyball contained a few brilliant members of the Oakland A’s front office using their intelligence to outwit their competitors. Their strategy was based on sabermetrics, a discipline new to many, though it had originated roughly a century earlier and been cultivated by dozens of earnest collaborators. Without the implication that sabermetrics is real and works in practice, the Moneyball story might just be about an iconoclast who got lucky. But in the name of telling a good story, the details of sabermetric theory are sparse in the book and nonexistent in the movie. We have attempted in the last two chapters to rectify these oversights by presenting the basics of sabermetric thought in a coherent and accessible fashion.
Baseball was the first professional team sport in the United States. It was also the first sport to introduce collective bargaining and free agency in the players’ market. And, it was the first sport to spawn the use of critical analytics to assess player performance and game and franchise business strategy.
The other team sports have always followed baseball, and the case with analytics is no different. Baseball, of course, lends itself to the use of statistical analysis to evaluate player performance, among other reasons, because it is easier to isolate the productivity of individual players in baseball.[1] This is true both because the success of a player depends directly on the actions of only one or two opposing players, and because there are a limited number of discrete outcomes from a play in baseball (as well as discrete possibilities leading up to a play). Hockey, basketball, soccer, and football plays are more interdependent and more continuous, and measurement of individual performance productivity is confounded by a host of complications.
Nonetheless, especially after the publication of Moneyball, the development of analytics in other team sports has accelerated. Interestingly, unlike in baseball, analytics practitioners in the other sports emerged roughly at the same time as the interest in analytics surfaced in team front offices. In baseball, the hobby of sabermetrics surfaced decades before the profession of sabermetrics. Thus, from the 1970s (or earlier) a growing group of intellectually curious stat heads began communicating with each other, trying to parse the myths and mysteries of the game. Later, in the 1990s, following the lead of
Sandy Alderson, Larry Lucchino, Dan Duquette, and others, the avocation of sabermetrics became the vocation of sabermetrics.
This succession in baseball meant that the first two decades of dialogue and insights occurred in broad daylight, in the public domain. In contrast, basketball and football analytics were quickly absorbed into the teams’ front offices, and much of the statistical work in these sports has proceeded as proprietary, in the realm of commercial secrets.
Another obstacle has confronted statistical analysis in basketball and football. The best measure of performance is not total output, but total output per number of attempts, or efficiency. In baseball, measuring runs (or some other output) per inning is straightforward. In basketball or football, measuring points per possession is less so. In part this is because offense and defense can run together; that is, the ball can move in both directions on the same play.
According to some basketball metricians, settling on a consistent definition of a possession is necessary before a reliable measure of efficiency can be developed. So, for instance, if we want to determine the value of taking a shot in basketball, we must know both the number of points a successfully executed shot brings and the cost of either making or missing the shot (one leads to a change in possession and the other may lead to a change in possession). To know this, we must estimate the value of a possession—that is, how many points result from an average possession. But to know this, we first have to know how many possessions a team has in a game. This number is connected to the pace of the game and it is not as easy to quantify as one might imagine.
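The possession-counting problem described above is usually attacked with a box-score approximation. The sketch below uses a formula commonly seen in basketball analytics (possessions ≈ field-goal attempts + turnovers + a fraction of free-throw attempts − offensive rebounds); the 0.44 free-throw coefficient is a conventional approximation that varies by analyst, and the box-score numbers here are hypothetical.

```python
# Back-of-the-envelope possession estimate from box-score statistics.
# The 0.44 coefficient approximates the share of free-throw attempts
# that end a possession; exact values differ across analysts.

def estimate_possessions(fga, fta, orb, tov):
    """Estimate team possessions: shot attempts plus turnovers, plus
    possession-ending free throws, minus offensive rebounds (which
    extend a possession rather than starting a new one)."""
    return fga + 0.44 * fta - orb + tov

def points_per_possession(points, fga, fta, orb, tov):
    """Offensive efficiency: points scored per estimated possession."""
    return points / estimate_possessions(fga, fta, orb, tov)

# Hypothetical line: 85 FGA, 25 FTA, 10 ORB, 14 TOV, 104 points
print(round(points_per_possession(104, 85, 25, 10, 14), 2))  # 1.04
```

Note that everything downstream (the value of a possession, the net value of a shot) inherits whatever error this estimate carries, which is why metricians insist on settling the definition first.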
Let us assume that we can measure the number of possessions and that the average possession yields one point to the offensive team. In this case, when a player sinks a three-point shot, the shot gives the other team possession and, hence, one point on average. So, the expected net value of the three-point conversion is two points. (This observation is mitigated by the existence of the shot clock in professional basketball, which mandates that the ball will change possession every twenty-four seconds, whether or not a shot is taken and/or made. Thus, a three-point shot at or near the buzzer does not really cause a change in possession.[2] Although there is no possession clock in football, football shares the characteristic with basketball that a score necessarily leads to a change in possession.) A similar netting out procedure would not have to be performed for a hit (or run) in baseball, because the hit (or run) does not bring that inning closer to an end. It would, however, have to be assessed in reference to an out.
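The netting-out arithmetic above reduces to a one-line calculation. This is a minimal sketch under the text's simplifying assumption that an average possession is worth one point; the function name is ours.

```python
# Net value of a made shot: the points it scores, minus the expected
# value of the possession it hands to the opponent.
AVG_POSSESSION_VALUE = 1.0  # assumed, per the text's simplification

def net_shot_value(points_scored, possession_value=AVG_POSSESSION_VALUE):
    """Expected net value of a made shot after the change of possession."""
    return points_scored - possession_value

print(net_shot_value(3))  # 2.0: a made three nets about two points
print(net_shot_value(2))  # 1.0: a made two nets about one point
```

A hit in baseball would skip this subtraction entirely, since it does not move the inning closer to its end; an out is where the analogous cost shows up.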
Another thing that aids quantitative analysis in baseball is the large sample size. Teams play 162 regular season games per year, roughly twice the number in basketball and hockey, four times the number in soccer, and ten times the number in football. Therefore, most annual statistics in baseball necessarily experience less random fluctuation.[3]
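The sample-size point can be made concrete with the standard error of an observed rate, which shrinks with the square root of the number of opportunities. The sketch below is illustrative only; the "true rate" and trial counts are hypothetical, not drawn from real player data.

```python
# The standard error of an observed success rate under a binomial
# model falls as 1/sqrt(n): more trials, less random fluctuation.
import math

def rate_standard_error(true_rate, n_trials):
    """Binomial standard error of an observed success rate."""
    return math.sqrt(true_rate * (1 - true_rate) / n_trials)

# A hypothetical .300 hitter over a full season (~600 at-bats)
# versus a short-season sample (~60 at-bats):
print(round(rate_standard_error(0.3, 600), 3))  # 0.019
print(round(rate_standard_error(0.3, 60), 3))   # 0.059
```

With ten times the opportunities, the noise band around the observed rate is roughly three times narrower, which is why a 162-game baseball season stabilizes statistics that a 16-game football season cannot.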
In sum, because of (a) the less open nature of analytics outside baseball, (b) the greater statistical conundrums outside baseball, (c) the less discrete and more collective nature of football, basketball, hockey, and soccer, and (d) the smaller sample size for most statistics, the analytic insights from quantitative study in these sports are either more inchoate or more elusive.
As in baseball, it is not a simple matter to identify when critical statistical analysis of performance and strategy was first introduced into the front offices of football, basketball, or other sport teams. Some teams did hire statisticians decades ago, but whether these employees did more than assemble the basic box scores and maybe add a few tweaks, and how any of this information was put to use, is more difficult to discern.