We all know how new statistics are born. The professional baseball quants get together in a back room and devise new ways to eliminate those dreaded traditional scouts, while their computers make all the decisions. But that raises the question: how do stats die? Well, the first step, in my mind, is education. So it’s time we all learn about one of the most wretched statistics in baseball: the save.
How to Save the Day
The MLB rules state that a save is earned by any relief pitcher who finishes the game for the winning team. However, the pitcher who earns the save cannot also receive a win (which brings up another awful statistic, the pitcher win, but that’s a story for another time). So far this seems somewhat reasonable, but there are three important criteria that must be considered when evaluating the possibility of a save. Here is the wording directly from MLB.com:
“A relief pitcher recording a save must preserve his team’s lead while doing one of the following:
- Enter the game with a lead of no more than three runs and pitch at least one inning.
- Enter the game with the tying run in the on-deck circle, at the plate or on the bases.
- Pitch at least three innings.”
Essentially, a pitcher has to come in during a “close” game and pitch a full inning, come in during a really close game and record the last out, or pitch three innings.
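For the programmers in the audience, here is a rough Python sketch of those criteria. The `ReliefAppearance` record and its field names are my own invention for illustration, and the logic follows only the wording quoted above, so treat it as a simplification rather than an official scoring tool.

```python
from dataclasses import dataclass


@dataclass
class ReliefAppearance:
    # All field names are hypothetical, chosen for this illustration.
    finished_game: bool       # he recorded the final out for the winning team
    credited_with_win: bool   # a pitcher cannot get both the win and the save
    preserved_lead: bool      # his team never gave up the lead while he pitched
    lead_on_entry: int        # run margin when he came in
    outs_recorded: int        # 3 outs = one full inning
    tying_run_close: bool     # tying run on base, at the plate, or on deck at entry


def earns_save(app: ReliefAppearance) -> bool:
    """Apply the three save criteria as quoted from MLB.com."""
    if not (app.finished_game and app.preserved_lead):
        return False
    if app.credited_with_win:
        return False
    close_lead_full_inning = app.lead_on_entry <= 3 and app.outs_recorded >= 3
    three_innings = app.outs_recorded >= 9
    return close_lead_full_inning or app.tying_run_close or three_innings


# A three-run lead and a clean ninth: save.
print(earns_save(ReliefAppearance(True, False, True, 3, 3, False)))
# A five-run lead and one inning with nobody threatening: no save.
print(earns_save(ReliefAppearance(True, False, True, 5, 3, False)))
```

Notice that nothing in the function asks how well the pitcher actually threw. As long as the lead survives, the save counts.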
So What’s the Problem?
The problem with the save is two-fold. On one side is the fact that saves simply are not a good mechanism for judging pitcher performance. On the other side is the human nature of coaches and general managers. When a pitcher puts up huge save numbers, it’s like a siren song to some front offices, and it can mislead them into either investing in the wrong pitcher or using their best pitcher in the wrong situations.
To start, let’s examine how saves correlate with other common pitching metrics. Each graph below includes a linear model that approximates how saves relate to the other statistic, along with an r-squared value. Basically, the r-squared value tells you how much of the variation in one variable is explained by its linear relationship with the other. An r-squared value close to 1 means that the two variables are closely related: if one changes, the other will likely change proportionally. Likewise, an r-squared value close to 0 means that a change in one variable is not a good predictor of a change in the other.
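If you want to see where an r-squared value comes from, here is a minimal Python sketch that fits a straight line by least squares and measures how much of the variation it explains. The FIP and save numbers in it are made up purely for illustration; they are not the data behind the plots in this post.

```python
def r_squared(x, y):
    """r-squared of a simple least-squares line predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope and intercept of the best-fit line.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the fit, versus total variation in y.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical pitcher seasons, invented for this example.
fip_values = [2.1, 2.8, 3.0, 3.4, 3.9, 4.2]
save_totals = [18, 41, 12, 35, 30, 22]
print(round(r_squared(fip_values, save_totals), 3))
```

With a single predictor, the result is the same whichever variable you put on which axis, so "saves versus FIP" and "FIP versus saves" give the same r-squared.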
The data I will be using is from all relief pitchers with 10 or more saves from each season since 2000.
FIP, or Fielding Independent Pitching, is a measure of a pitcher’s performance based only on what he can control. The performance of the defense is not taken into account, making FIP a more accurate (though still incomplete) representation of a pitcher’s performance than ERA. You can read more about FIP here. Below is the plot of single-season FIPs versus the total number of saves these relievers had in that season:
Although we cannot value a pitcher solely on FIP, it’s easy to see that having a lot of saves does not necessarily mean that a pitcher performed well according to his FIP. There’s a slight downward trend, but an r-squared value of .0853 means that only about 8.5% of the variation in the number of saves can be predicted by a pitcher’s FIP. The red data points on the plot indicate those pitchers who had a better-than-average FIP but a measly number of saves (fewer than 20). These pitchers performed excellently, but would be overlooked if judged on saves alone.
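If you’re curious what actually goes into FIP, the commonly published formula fits in a few lines of Python. The constant is an assumption on my part: it is recalculated each season to put FIP on the same scale as league ERA, and the 3.10 below is only a placeholder in the usual range, not the value for any particular year.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings,
        fip_constant=3.10):
    """Fielding Independent Pitching from the outcomes a pitcher controls.

    `innings` must be true decimal innings (65 and one-third is 65.333...),
    not the box-score shorthand "65.1". `fip_constant` is a placeholder.
    """
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + fip_constant


# A hypothetical reliever season, invented for this example.
print(round(fip(home_runs=5, walks=20, hit_by_pitch=2,
                strikeouts=80, innings=70.0), 2))
```

Nothing about the score, the inning, or the size of the lead appears anywhere in that formula, which is exactly why it makes a useful comparison against saves.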
Let’s look at two more metrics: WHIP and LOB%.
WHIP, or Walks plus Hits per Inning Pitched, is essentially the average number of baserunners a pitcher allows in an inning.
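In code, WHIP is about as simple as a baseball stat gets. This is a quick sketch with a function name of my own choosing:

```python
def whip(walks, hits, innings):
    """Walks plus hits allowed, per inning pitched.

    `innings` must be true decimal innings, not box-score shorthand.
    """
    return (walks + hits) / innings


# Hypothetical season: 20 walks and 50 hits over 70 innings.
print(whip(walks=20, hits=50, innings=70.0))  # 1.0
```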
Again, those pitchers who allow very few baserunners will not necessarily have a high number of saves, as seen by the red data points. And after all, isn’t the job of a relief pitcher, especially a closer, to allow as few baserunners as possible?
LOB%, or Left On Base percentage, is the percentage of the baserunners a pitcher allows who are then left stranded rather than scoring.
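LOB% is usually not counted runner by runner. The version I’m familiar with, which I believe is the FanGraphs formula, estimates it from season totals, so take this Python sketch as an approximation under that assumption:

```python
def lob_pct(hits, walks, hit_by_pitch, runs, home_runs):
    """Estimated share of baserunners stranded, from season totals.

    Assumes the commonly published estimate:
    (H + BB + HBP - R) / (H + BB + HBP - 1.4 * HR)
    """
    baserunners = hits + walks + hit_by_pitch
    return (baserunners - runs) / (baserunners - 1.4 * home_runs)


# A hypothetical reliever season, invented for this example.
print(round(100 * lob_pct(hits=50, walks=20, hit_by_pitch=2,
                          runs=18, home_runs=5), 1))
```

Because it is an estimate rather than a literal count, a pitcher who allows very few runs over a relief-sized workload can land very close to 100%.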
Once again we see a significant number of pitchers who strand a higher-than-average share of baserunners but come away from the season with low save totals. Judged on saves alone, many people would be hard pressed to call those pitchers “elite closers.”
All of the above plots have r-squared values below 0.1, meaning less than 10% of the variation in the number of saves can be predicted by better measures of performance like FIP, WHIP, and LOB%. Again, the red data points in each plot represent pitchers who were great according to our “better” metrics but had low save totals. (Fun fact: that dot at 100% on the “LOB% vs Saves” plot is Huston Street’s 2013 season with the Padres. He had a 99.5 LOB% and 33 saves. Not bad.)
Pump the Brakes
The more sabermetrically savvy folks out there might be saying, “Hang on, this guy is trying to compare a cumulative statistic with a rate statistic. Maybe a pitcher has a low number of saves simply because his team doesn’t give him as many save opportunities.” To which I would say, yes, that’s true, but doesn’t that demonstrate the issue?
When we judge a closer’s performance based on how many saves he has, we are actually judging his team on how good they are at playing into save opportunities. The truly silly part is that a really good team could have just as few save opportunities as a really bad team. Since a team generally has to be up by three or fewer runs to be in a save situation in the ninth inning, those teams that consistently blow everybody out by large margins will have fewer save opportunities. A team has to be good, but not too good, in order to have the “best” closer with the “best” save numbers.
The Human Factor
This leads us to the second part of our two-fold conundrum: the nature of coaches and general managers. When a pitcher gets tagged as a “closer,” this seems to flip a switch in many people’s minds: he can pitch only the ninth inning, and only in a save opportunity. Only on the rarest occasions shall Mr. Closer play in a different situation. But in which situation would you rather have your best reliever in the game: a one-run lead in the eighth, or a three-run lead in the ninth? Many coaches would not dare trot out their One-Inning-Wonders during the seventh or eighth innings, no matter how close the game is, because there’s an off chance that there might be a save opportunity in the ninth. Not to mention the fact that a guy can give up two runs in the ninth inning and still earn a save, but I don’t see anybody applauding starters, or even middle relievers, who give up two runs in one inning of work.
Another atrocious feature of save totals is how unpredictable they are from season to season. This is especially true when a pitcher changes teams, since his new team may encounter fewer save opportunities. Take, for example, K-Rod’s career save totals by year (minus a couple of years where he didn’t record enough saves to make our cut):
In 2008, Rodriguez set the single-season save record (that far right dot on each of the above plots). After that he signed a $37 million contract with the Mets, and yet his save total decreased by 43%, which is what we in the biz call “regression to the mean.” In 2011 he was traded to the Brewers and lost the closer position to John Axford (all 23 of his saves that season came with the Mets). In 2012 and 2013 he rarely pitched in save opportunities, but in 2014 he once again became “the closer.” But did Francisco Rodriguez, as an elite reliever, really change all that much over those few years? Or did the team that was playing behind him change?
Maybe K-Rod was just really feeling it in 2008 and 2014. Maybe he was secretly really hurt in the years in between. Or maybe the teams he played for weren’t always good at being up by precisely 1 to 3 runs in the ninth. Maybe the save statistic is stupid. Maybe it’s ruining relief pitching.
It’s Up to Us
At its core, the save tries to compliment a pitcher’s guts and fortitude. That’s a noble pursuit, but when a team is trying to decide how many millions of dollars a relief pitcher is worth, or if a certain pitcher should only be The Ninth Inning Guy, I think we should stay true to our data.
So, coaches, general managers, loyal fans, stop paying attention to saves. Ignore the Baseball-Reference reports and the column on the back of your pitchers’ baseball cards. Play your best pitcher when he’s needed, not just in the ninth. Quit it with this talk of “closers.” And please, for the love of all that is holy, do not vote a pitcher into the Hall of Fame based on save totals. We aren’t up by three runs, the tying run isn’t on deck, we haven’t even pitched three innings, but together we can save the game of relief pitching by killing the save.