The Sabermetric Revolution: Assessing the Growth of Analytics in Baseball
Authors: Benjamin Baumer, Andrew Zimbalist
Unfortunately for Lewis, his prescience about the 2002 draft is almost universally off-base. He writes: “The selections made [in the 2002 draft] are, from the A’s point of view, delightfully mad. Eight of the first nine teams select high schoolers. The worst teams in baseball, the teams that can least afford for their draft to go wrong, have walked into the casino, ignored the odds, and made straight for the craps table.”
29
In reality, only six, not eight, of the first nine picks were high schoolers, and of those six, three went on to become stars (B. J. Upton, Prince Fielder, and Zack Greinke).
30
The first overall pick, Bryan Bullington by the Pirates, turned out to be one of the worst in history, despite Lewis’s remark that “at least [he’s] a college player.”
31
In retrospect, the first nine selections of the 2002 draft serve as a reminder of why teams draft high school players: because that’s where the top talent is most often available.
32
It warrants mentioning that Bill James has long held that the level of play is too low and too uneven to be able to make much out of performance statistics in intercollegiate baseball. To be sure, college stats generally mean more than high school stats, but that is different from saying one can lean on them entirely to make draft decisions. James did develop major league equivalency conversions between different levels of minor league ball and the majors, but we believe these to be incomplete and imperfect gauges of major league potential. At best, they project what happens to the average minor league and average major league player. A useful analogy here is between a Ferrari and a Corolla both driving on an interstate at 70 miles per hour. Would this mean that in an open road race we would predict the two cars would finish in a dead heat?
In any event, Beane seems to have changed his tune on the desirability of high school picks: between 2002 and 2011 the A’s had twenty-two first-round picks and all of them were college players; in 2012, their three first-round picks were all high school players.
Lewis’s storytelling works better with simplicity and with heroes. Billy Beane was a likely suspect: a former highly touted ballplayer, good-looking, and smart enough to get a baseball scholarship offer from Stanford. Another hero was Bill James, folksy, rambunctious, a gifted scribe, and a self-made man who came along at the right moment (the advent of free agency and exploding player salaries, along with the maturation of the computer and the Internet). James played an important role in popularizing the use of statistics to provide new insights into understanding the value of player performance and game strategy.
The real history of baseball statistics, however, is a good deal richer and more involved than Lewis would have us believe. While Lewis makes a fleeting reference to antecedents to the work of Bill James, he does not acknowledge either the significant intellectual development of such thought or the fact that it was put into practice in the major leagues at various times. The reality is that Bill James never claimed to invent baseball analytics, and he didn’t. Many of James’s insights were developed years earlier by others. Nonetheless, James did suggest the term “sabermetrics,” advance the sophistication of statistical analysis, and help significantly to spread its practice.
The initial introduction of statistics for analyzing the game can be traced to Henry Chadwick, who pioneered the box score (such as it was) back in the 1860s. In 1872, Chadwick proposed a measure, similar to Bill James’s “range factor,” as the proper way to assess fielding prowess.
33
In Chadwick’s day, walks were uncommon (depending on the year, between five and nine balls were necessary for a walk), so it would not have made sense to emphasize on-base percentage. But Chadwick did emphasize the importance of not making an out, as opposed to the importance of getting a hit—a prominent feature of modern sabermetric analysis.
At the beginning of the twentieth century, baseball writer F. C. Lane denounced batting average as a meaningless metric, calling it “worse than worthless” and signaling the importance both of walks and extra base hits. In 1916, he penned the question: “Would a system that placed nickels, dimes, quarters and 50-cent pieces on the same basis be much of a system whereby to compute a man’s financial resources?”
34
Lane tracked the play-by-play of sixty-two games and found the following run values for hits: singles = .457 runs; doubles = .786 runs; triples = 1.15 runs; and home runs = 1.55 runs. A single not only meant that a batter reached first base, but it also meant that any existing runners advanced a base or more, and it provided the possibility for at least one additional batter to be up that inning. Similar reasoning applies to extra base hits and, thus, a home run on average increased the number of runs scored per inning by 1.55, roughly 50 percent more than the one run it guarantees. This analysis by Lane, from 1917, was a clear precursor to James’s metric of Runs Created and the concept of linear weights.
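Lane’s run values amount to a weighted sum over hit types. The following is a minimal sketch of that idea, not Lane’s own procedure: only the four weights come from the text, while the function name `weighted_runs` and the season line are invented for illustration.

```python
# Lane's run values per hit type, as quoted in the text.
LANE_WEIGHTS = {
    "single": 0.457,
    "double": 0.786,
    "triple": 1.15,
    "home_run": 1.55,
}


def weighted_runs(hits, weights=LANE_WEIGHTS):
    """Return the sum of (count of each hit type) * (its run value)."""
    return sum(weights[kind] * count for kind, count in hits.items())


# Hypothetical season line (assumption: made up for this example).
example_hits = {"single": 100, "double": 30, "triple": 5, "home_run": 20}

total_hits = sum(example_hits.values())
total_run_value = weighted_runs(example_hits)

print(f"hits: {total_hits}, weighted run value: {total_run_value:.2f}")
```

Batting average would credit this hypothetical hitter with 155 interchangeable hits. The weighted sum instead separates the twenty home runs from the hundred singles, which is the point of Lane’s coin analogy.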
While GM of the Brooklyn Dodgers, baseball innovator Branch Rickey hired Allan Roth in 1944 as the first team statistician in baseball.
35
Roth’s work convinced Rickey to use OBP as the basis for evaluating a batter’s talent, and his analysis was instrumental in convincing the Dodgers to trade Dixie Walker as well as to bat Jackie Robinson in the cleanup position. Based on Roth’s theories, Rickey published a ten-page article in Life magazine in 1954 that argued for a new way to assess baseball performance. The article came complete with a complex mathematical formula whose first two terms basically represented OPS (On-base Plus Slugging = OBP + SLG), another staple of the modern sabermetric toolbox.
36
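For readers who want to see the OBP + SLG arithmetic, here is a minimal sketch. It uses the standard modern definitions of on-base percentage and slugging, which are an assumption here: the text does not reproduce Rickey’s formula, and this is not a reconstruction of it. The function names and the stat line are hypothetical.

```python
def on_base_pct(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP: times on base divided by plate appearances that count toward OBP."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_pct(singles, doubles, triples, home_runs, at_bats):
    """SLG: total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Hypothetical season line (assumption: made up for this example).
singles, doubles, triples, home_runs = 100, 30, 5, 20
hits = singles + doubles + triples + home_runs
at_bats, walks, hit_by_pitch, sac_flies = 500, 60, 5, 5

obp = on_base_pct(hits, walks, hit_by_pitch, at_bats, sac_flies)
slg = slugging_pct(singles, doubles, triples, home_runs, at_bats)
ops = obp + slg

print(f"OBP {obp:.3f} + SLG {slg:.3f} = OPS {ops:.3f}")
```

OBP and SLG have different denominators, so OPS is a convenient index, not a true rate.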
Whether inspired by Roth and Rickey or not, Casey Stengel displayed his own adumbration of moneyball when he periodically wrote slow-footed but high on-base guys Norm Cerv and Elston Howard into the leadoff spot on the lineup card.
The intellectual side of baseball analytics was given another boost in the 1950s and early 1960s by the work of George Lindsey. Lindsey developed an early version of the run expectancy matrix (discussed in the Appendix). Following the statistical work of Lane, Lindsey found slightly different run values or weights for hits (singles = .41, doubles = .82, triples = 1.06, home runs = 1.42). He also estimated the positive effect of platooning and the uncertain value of sacrifice bunts and stolen bases.
37
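One way to see how “slightly different” the two sets of weights are is to express each hit type in units of a single. This is a minimal sketch: the weights are those quoted in the text for Lane and Lindsey, while the normalization and the helper name `relative_to_single` are illustrative assumptions, not a method attributed to either writer.

```python
# Run values per hit type as quoted in the text for each analyst.
WEIGHTS = {
    "Lane": {"single": 0.457, "double": 0.786, "triple": 1.15, "home_run": 1.55},
    "Lindsey": {"single": 0.41, "double": 0.82, "triple": 1.06, "home_run": 1.42},
}


def relative_to_single(weights):
    """Divide every run value by the single's run value."""
    base = weights["single"]
    return {kind: value / base for kind, value in weights.items()}


relative = {name: relative_to_single(w) for name, w in WEIGHTS.items()}

for name, ratios in relative.items():
    row = ", ".join(f"{kind}={ratio:.2f}" for kind, ratio in ratios.items())
    print(f"{name}: {row}")
```

On this scale a double is worth about 1.72 singles under Lane’s weights and about 2.0 under Lindsey’s. Both put a home run at a little under three and a half singles.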
After Lindsey came Earnshaw Cook. In 1964, Cook published his classic Percentage Baseball, the first book-length treatise on applying statistical analysis and probability theory to game strategy and player performance. Cook’s volume and its sequel yielded many insights (as well as some wacky strategy suggestions), but, unfortunately, used idiosyncratic and abstruse mathematical notation that made it difficult to understand even for math Ph.D.s. Among other things, Cook developed a concept he called the Scoring Index that emphasized the importance of on-base percentage and slugging. Preeminent sportswriter Frank Deford discovered Cook’s work and wrote a piece about him in Sports Illustrated.
38
According to Alan Schwarz, Astros GM Tal Smith; Orioles, Mariners, and Red Sox GM Lou Gorman; and manager Davey Johnson all read Cook or were made aware of his theories.
39
Cook’s work was followed by Harlan and Eldon Mills, who published Player Win Averages: A Computer Guide in 1972. The Mills brothers, using data from the 1970 season, developed a program that estimated how much each player’s hits increased their team’s probability of winning the game and developed a parallel metric. Harlan and Eldon Mills were invited to meet with executives from the Yankees and the Mets, but they were never asked to do any formal consulting and there is no evidence that their metric was ever put into practice.
Earl Weaver was hired by Orioles owner Jerry Hoffberger to manage the team in 1968.
40
Weaver had noticed during his playing days, entirely in the minor leagues, that there were some mediocre pitchers he couldn’t hit at all and some really good pitchers against whom he had great success. He didn’t know why. He just knew it was true and that the pattern kept repeating itself. When he became manager, he went to the Orioles’ public relations director, Bob Brown, and asked him to provide match-up data on note cards between the Orioles’ batters and the opposing pitchers. Soon, Weaver was also using note cards with data on opposing batters versus his pitchers. Weaver became famous for using opposite-handed platoons. Over time he began to develop other theories and tested them. He grew suspicious of the sacrifice bunt and stolen base, and came to be associated with the cliché “a walk, a blooper, and a three-run homer.”
41
Weaver’s emphasis on not wasting outs anticipated one of the best-known sabermetric mantras.
As Weaver entered the twilight of his managerial career, other teams started to take notice of the strategic role of statistics. In 1979, Tal Smith, GM of the Houston Astros, hired sabermetrician Steve Mann to do statistical analysis. Around 1980, the San Francisco Giants hired statistical analyst Eric Walker. Walker attributes his interest in baseball analytics to reading Percentage Baseball by Earnshaw Cook. In 1981, the Texas Rangers hired Craig Wright, whose business card read “Sabermetrician.” In 1982, Sandy Alderson, then GM of the A’s, hired Walker to do statistical reports for the team. Walker continued to work with Alderson into the 1990s and wrote the pamphlet that Alderson used to indoctrinate Beane into sabermetrics.
42
The late 1970s and early 1980s, of course, were the years when Bill James was writing his yearly Baseball Abstract, and Pete Palmer and John Thorn were helping to spread the sabermetric gospel through their 1984 classic The Hidden Game of Baseball. Palmer had been doing sophisticated statistical analysis of baseball since the 1960s, but it wasn’t until he hooked up with John Thorn, and Bill James had aroused general interest, that Palmer’s linear weights model and other insights found publication.
43
Soon after Dan Duquette became GM of the Montreal Expos in September 1991, he hired statistical analyst Mike Gimbel. During the early 1980s, Duquette had worked as an assistant scout for the Milwaukee Brewers and had met Dan Okrent, who was working on his book Nine Innings, about one game of the Brewers’ 1982 season. Okrent turned Duquette on to the emerging literature in sabermetrics that eventually led to his hiring of Gimbel. When Duquette went from Montreal to the Red Sox in 1994, he took Gimbel with him.
Meanwhile, Larry Lucchino, GM of the Baltimore Orioles at the time, hired Eddie Epstein in 1986. When Lucchino went to the Padres in 1995, Epstein went with him.
All these hires of statistical analysts in the 1970s, 1980s, and 1990s, of course, predated Beane’s hiring of Paul DePodesta in November 1998, yet Lewis sets DePodesta’s employment with the A’s as marking a watershed for sabermetrics. Lewis does recognize in passing some of the earlier hires, but he dismisses their significance, claiming that they were outside the decision-making apparatus and had “cult” status.
While it is true that DePodesta had a more significant role than his predecessors, those who came before him were not marginalized and insignificant. As we have seen, Eric Walker had a profound effect not only on Sandy Alderson but also on Billy Beane. Alan Schwarz avers that Walker had a role in creating the world champion A’s teams during 1988–1991.
44
Duquette states that Mike Gimbel was involved in giving player acquisition advice, and Lucchino says that Eddie Epstein participated in meetings about personnel issues.
45
Apparently, Epstein’s advice was instrumental in the Orioles acquiring Brady Anderson. Epstein urged the move because of Anderson’s high walk rate.
46
Given the long intellectual and practical history of statistical analysis guiding player moves and strategy in baseball, and the uncertain impact that sabermetrics had on the A’s success in the early 2000s, the question arises: Did Billy Beane innovate in this field?
Our answer is that while Beane significantly extended the application of sabermetrics, he did not innovate in its use. Beane also extended the use of “moneyball,” understood to mean the specific application of sabermetrics in order to identify market inefficiencies (bargains), but here too previous GMs had combed new statistics to find undervalued players. Indeed, Sandy Alderson himself was seeking bargains for the A’s even while Wally Haas was still the owner prior to 1995.
Three other matters from Moneyball cry out for discussion: the inconsistent characterization of Beane’s methods; the accuracy of the implied predictions in the Lewis approach; and the role of moneyball in addressing the competitive balance problem in major league baseball.
The subtitle of Moneyball is “The Art of Winning an Unfair Game”; the title of chapter six is “The Science of Winning an Unfair Game.”
47
Is it an art or a science? Lewis can’t have it both ways.
Lewis is clear from the beginning that “reason, even science, was what Billy Beane was intent on bringing to baseball.”
48
The confusion continues when Lewis enumerates Beane’s five simple rules for success. Rule 3 is: “Know exactly what every player in baseball is worth to you. You can put a dollar figure on it.”
49
This rule, along with the other four, suggests that there is something systematic, if not scientific, behind Beane’s practice of moneyball.
50
As we will discuss in later chapters, the notion that one can know “exactly” what a player (let alone every player) is worth is fanciful in the extreme. Yet, since, according to Lewis, Beane actually tries to do this, his method must be deemed to be systematic. (Others might call it delusional.) However, on the same page, Lewis opines, “His [Beane’s] approach to the market for baseball players was by its nature unsystematic. Unsystematic—and yet incredibly effective.” Eight
pages later, Lewis writes of the Giambi-for-Mabry trade: “Billy hardly knew who Mabry was.” And writing about Beane’s decision not to go to the Red Sox, Lewis declares: “Billy confined himself to the usual blather about personal reasons. None of what he said was terribly rational or ‘objective’—but then neither was he.”
51