The Baseball Economist: The Real Game Exposed

The stats I’m interested in are the Big 3: AVG, OBP, and SLG. When evaluating players, it’s important to know which characteristics matter more than others. As I’ve discussed, properly valuing skills, when other teams do not, can pay big returns. In evaluating the run-producing impact of these metrics using linear regression analysis, my time period of analysis is the era following the most recent expansion (1998–2004). This is so we are evaluating players playing at the same level of competition. Regression analysis tells us many things, but initially we ought to be concerned with how closely each metric predicts runs per game. We can then attribute the difference in runs per game across teams to differences in the values of the different metrics. Figure 10 plots the relationship between AVG, OBP, and SLG and runs per game by team. Each point represents the runs per game scored by a team and the associated value of the statistic in question. The slope of the regression line represents the estimated relationship between the metric and runs that minimizes the squared error of the prediction.
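To make the regression line concrete, here is a minimal sketch of an ordinary least squares fit of runs per game on a single metric. The team values are hypothetical placeholders, not the 1998–2004 sample, and the helper name `fit_line` is my own.

```python
def fit_line(xs, ys):
    """Least-squares line ys = intercept + slope * xs (minimizes squared error)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical team-seasons (illustration only, not real data).
obp = [0.310, 0.320, 0.330, 0.340, 0.350, 0.360]
runs_per_game = [4.1, 4.4, 4.6, 5.0, 5.1, 5.5]

slope, intercept = fit_line(obp, runs_per_game)
print(f"predicted runs per game = {intercept:.2f} + {slope:.2f} * OBP")
```

Each point in a plot like Figure 10 would be one `(obp, runs_per_game)` pair, and the fitted slope is the tilt of the line drawn through them.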
An obvious trend reflected in all of these graphs is a strong positive correlation between the offensive statistics and runs scored per game. A positive correlation means the metrics move in the same direction (when AVG goes up, runs go up; when AVG falls, runs go down), and a negative correlation means the metrics move in opposite directions. The higher a team’s AVG, OBP, or SLG, the more runs the team is expected to score, which is not surprising. What is surprising is the proximity of the predicted runs of the regression lines to the actual runs scored by teams. Notice that the actual data points are farther away from the regression line for AVG than for OBP and SLG. The closer the points are to the prediction, the better the “fit” of the data to the prediction yielded by the offensive stat.
While the graphs provide a nice guide, we can quantify exactly how much better OBP and SLG are at predicting run scoring from the regression estimates. Table 23 reports, by league, the share of the difference in runs scored by teams that is explained by differences in the offensive metrics. We need to generate estimates for both leagues, because the use of the designated hitter in the AL may cause the estimates to differ. The differences in AVG across teams explain between 64 percent and 72 percent of the difference in runs scored per game, while OBP and SLG explain 80–85 percent. In other words, AVG tells us 8 to 21 percentage points less about a team’s run-scoring ability than OBP or SLG. This means that OBP and SLG are better predictors of run scoring than AVG.
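For a regression on a single metric, the share of variance explained equals the squared correlation between the metric and runs. The sketch below computes it that way, under the assumption of made-up team values; none of these numbers come from Table 23, and the function name `correlation` is mine.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation: +1 moves together, -1 moves in opposite directions."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical team-seasons (illustration only, not real data).
avg = [0.255, 0.262, 0.270, 0.275, 0.281, 0.268]
obp = [0.320, 0.335, 0.330, 0.350, 0.345, 0.360]
runs_per_game = [4.2, 4.6, 4.7, 5.1, 5.2, 5.2]

for name, metric in (("AVG", avg), ("OBP", obp)):
    explained = correlation(metric, runs_per_game) ** 2
    print(f"{name} explains {explained:.0%} of the variance in runs per game")

# The gaps quoted in the text, in percentage points.
low_gap = 80 - 72   # smallest gap between OBP/SLG and AVG
high_gap = 85 - 64  # largest gap between OBP/SLG and AVG
```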
In one sense, this is not surprising; OBP captures a team’s ability to get on base via hits (like AVG), plus walks, and getting hit by pitches. SLG is just a batting average that weights the power of the hits; it’s an AVG with kick. The reason one might consider it surprising is that the one statistic that everyone in baseball seems to focus on to judge offensive prowess is the batting average. But it’s still too early to give up the search. Maybe AVG does tell us something important beyond OBP and SLG. What if we looked at all of the metrics at the same time for each team? It’s possible that AVG could have a greater marginal impact on generating runs, with each additional batting average “point” being more valuable to run production than additional points of OBP and SLG.
Using the same sample of teams, we can estimate the impact of all of the metrics on runs per game at the same time with multiple regression analysis, which has some distinct advantages over the single variable regressions discussed above. Figure 11 shows a much tighter relationship between the predicted runs and actual runs than any of the metrics on their own, explaining approximately 92 percent of the variance in runs scored. But the story doesn’t end here. The results indicate that the statistics have different impacts on run scoring.
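A multiple regression of this kind can be sketched with nothing but the normal equations. The code below is one way to do it, not necessarily the method behind Figure 11. The seven team rows are hypothetical, with each statistic expressed in points (.255 written as 255), and `solve`, `fit_multiple`, and `r_squared` are names I made up for the illustration.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple(rows, ys):
    """OLS with an intercept; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(coefs, rows, ys):
    """Share of the variance in ys explained by the fitted model."""
    preds = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(ys) / len(ys)
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst


# Hypothetical (AVG, OBP, SLG) in points and runs per game; not real data.
teams = [(255, 320, 400), (262, 335, 415), (270, 330, 445), (275, 350, 430),
         (281, 345, 460), (268, 360, 420), (259, 325, 435)]
runs_per_game = [4.2, 4.6, 4.9, 5.1, 5.3, 5.0, 4.6]

coefs = fit_multiple(teams, runs_per_game)
print("intercept, AVG, OBP, SLG:", [round(c, 4) for c in coefs])
print("variance explained:", round(r_squared(coefs, teams, runs_per_game), 3))
```

Each coefficient after the intercept is the estimated change in runs per game from one more point of that statistic, holding the other two fixed.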
Table 24 lists two types of measured impact of each statistic on runs per game: the direct marginal impact and the elasticity. Because all of the statistics are included in the same regression model, we can calculate the added impact of each additional point of AVG, OBP, and SLG. Elasticity is also a marginal impact measure, but it interprets the effect in percentage terms at the average value for runs per game and the statistic. For example, the elasticity of 1.22 for NL OBP when all of the metrics are included means that a 1 percent increase in OBP is associated with a 1.22 percent increase in runs per game. This metric is useful for comparing the sensitivity of the impacts in the same terms.
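The conversion from a marginal impact to an elasticity at the averages can be sketched as the marginal impact times the ratio of the average statistic to the average runs per game. The inputs below are hypothetical and are not the Table 24 estimates; `elasticity_at_means` is a name I chose for the illustration.

```python
def elasticity_at_means(marginal_impact, xs, ys):
    """Percent change in y per 1 percent change in x, evaluated at the sample means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return marginal_impact * mean_x / mean_y


# Hypothetical inputs (illustration only, not real estimates).
obp_points = [320, 335, 330, 350, 345, 360]        # OBP in points (.320 -> 320)
runs_per_game = [4.2, 4.6, 4.7, 5.1, 5.2, 5.2]
runs_per_obp_point = 0.018                         # assumed marginal impact

e = elasticity_at_means(runs_per_obp_point, obp_points, runs_per_game)
print(f"a 1 percent rise in OBP goes with a {e:.2f} percent rise in runs per game")
```

Because the result is unit-free, elasticities for AVG, OBP, and SLG can be compared side by side even though the statistics sit at different average levels.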
The results are even more striking than before. AVG has a very small impact on runs per game compared to OBP and SLG. And in the case of the American League the impact is actually negative, meaning additional points of AVG are associated with lower runs scored. But note two things. First, the estimate for AVG is not “statistically significant,” which means we cannot say with a high level of confidence that AVG has any effect on scoring runs beyond the information we gather from OBP and SLG. Second, AVG seems to have zero explanatory power in predicting runs. Table 25 lists the variance explained with all of the metrics in the regression model and with only OBP and SLG and not AVG. The inclusion of AVG does not change the variance explained in a regression model, which means it doesn’t add any more predictive power than OBP and SLG alone. So it would be very wrong to interpret higher batting averages as lowering offense. Instead, the regression tells us that AVG adds no more information once we know the OBP and SLG of a team.
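The with-and-without comparison can be sketched by fitting the model twice and comparing the variance explained. This is a rough illustration on hypothetical team rows (statistics in points), not a reproduction of Table 25, and the helper names are mine. With made-up data the size of the gap means nothing; the point is only the mechanics of the comparison.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def variance_explained(rows, ys):
    """Fit OLS with an intercept and return the share of variance explained."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    coefs = solve(xtx, xty)
    preds = [sum(c * x for c, x in zip(coefs, d)) for d in design]
    mean_y = sum(ys) / len(ys)
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst


# Hypothetical (AVG, OBP, SLG) in points and runs per game; not real data.
teams = [(255, 320, 400), (262, 335, 415), (270, 330, 445), (275, 350, 430),
         (281, 345, 460), (268, 360, 420), (259, 325, 435)]
runs_per_game = [4.2, 4.6, 4.9, 5.1, 5.3, 5.0, 4.6]

with_avg = variance_explained(teams, runs_per_game)
without_avg = variance_explained([(obp, slg) for _, obp, slg in teams], runs_per_game)
print(f"AVG + OBP + SLG: {with_avg:.3f}")
print(f"OBP + SLG only:  {without_avg:.3f}")
print(f"gained by adding AVG: {with_avg - without_avg:.3f}")
```

Adding a variable can never lower the variance explained in a least squares fit, so the question is whether the gain is large enough to matter.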
These numbers also identify an anomaly that the A’s exploited, which Moneyball documents. At the time, the conventional wisdom was that each point of OBP was worth between 1.5 and 1.8 times more than each point of SLG. The A’s found that, in fact, each point of OBP was about three times more valuable than each point of SLG. The marginal impact numbers from Table 25 for the American League, in which the A’s play, indicate that OBP is 2.33 times more important than SLG (21/9 = 2.33) in producing runs over this time period. While it’s not quite as large for this sample as what Lewis reported, when I exclude the 2003–2004 seasons, which occurred after Moneyball was written, the marginal effect of OBP is almost exactly three times that of SLG.
Despite the fact that OBP and SLG tell us more about a player’s offensive contribution, television commentators continue to post AVG at the bottom of the screen as each batter steps to the plate. OBP and SLG often make appearances, but they are secondary. Why? Habit.
Calculating AVG is very simple, and without the help of computers, generating an OBP and SLG was quite a taxing affair. The inertia of tradition can be strong, especially in baseball.
But if you want to buck tradition and look to statistics with more to say about producing runs, there are plenty of options. OBP and SLG are just the beginning of the metrics sabermetricians have developed to predict offense. Some examples include runs created and linear weights. The list of new statistics is very long, and most of the metrics out there are quite good and accurate in their attempts to measure player contributions to producing runs. However, these metrics require several sophisticated calculations or a trip to several baseball statistics websites. If you’re watching a game, by the time you look up a player’s stats online, you’ve missed the whole half inning. Most of these stats are not “couch worthy”: I can’t calculate them without leaving the couch. But there does exist a simple shortcut for gauging the ability of a player. The solution is to add OBP and SLG together to create a metric conveniently known as OPS (on-base plus slugging), developed by John Thorn and Pete Palmer.
While adding the two may seem odd and doesn’t make much intuitive sense, OPS does a fantastic job at predicting runs per game. Compare the graph of OPS in Figure 12 to the graph of the Big 3 in Figure 11 in terms of predicting runs. They are nearly identical. Table 25 shows that the explained variance of OPS is about 90 percent, which is quite high. OPS is the sabermetrician’s shortcut, and I find it quite handy on the couch and at the ballpark.
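For readers who want to see the shortcut worked out, here is a small sketch that builds AVG, OBP, SLG, and OPS from a batting line. The stat line is hypothetical, the function name `batting_line` is mine, and the OBP denominator follows the standard official definition, which also counts sacrifice flies.

```python
def batting_line(ab, h, doubles, triples, hr, bb, hbp, sf):
    """Return AVG, OBP, SLG, and OPS from basic counting stats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab
    # Times on base (hits, walks, hit by pitch) over the qualifying plate appearances.
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    # Slugging weights each hit by the number of bases it is worth.
    slg = total_bases / ab
    return {"AVG": avg, "OBP": obp, "SLG": slg, "OPS": obp + slg}


# Hypothetical season line (illustration only, not a real player).
line = batting_line(ab=500, h=150, doubles=30, triples=5, hr=20, bb=60, hbp=5, sf=5)
for name, value in line.items():
    print(f"{name}: {value:.3f}")
```

Once OBP and SLG are on the screen, OPS is a single addition, which is what makes it usable from the couch.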
