Wednesday, March 26, 2008

How I Learned to Stop Worrying (about ERA) and Love RA

I hate unearned runs. More specifically, I hate their arbitrary separation from earned runs. In the humble opinion of this excellent actor, they are the worst stat in baseball today. Sure, a stat like wins is terrible and often very misleading, but anyone that has ever thought about wins for more than a second realizes their inherent flaws. When arguing about pitcher quality, anyone who wants to make an intelligent comparison focuses on runs allowed, innings pitched, strikeouts, things of this nature. But everyone uses ERA as their quick and easy pitcher comparison stat. It's universally accepted. In comparing pitchers, you look at ERA, adjust it for the home park and/or league, account for innings pitched, and voila, you know which pitcher is better. Except the decision to use ERA instead of RA (runs allowed, which includes earned and unearned runs) could create a serious flaw in the conclusion.

First, the case against separating earned and unearned runs. As anyone who has ever watched a baseball game knows, errors are very arbitrary. Hometown scorekeeping frequently skews the awarding of errors, so that the home team is more likely to be credited with hits than to reach on errors. Plays where a fielder gets a terrible jump on the ball and doesn't come close to it are scored as hits, while a harder play where the fielder gets a great jump and ranges very far but bobbles the ball is scored an error. Fielding mistakes by outfielders are rarely scored as errors, while most mistakes by infielders are ruled errors, at least those that aren't a result of a lack of range. This unfairly penalizes fly ball pitchers, as they're deemed to be responsible for more of their runs than ground ball pitchers. Finally, pitchers are actually responsible for most of the unearned runs they allow. That an error helped prolong the rally doesn't excuse the other hits allowed by the pitcher that let the rally continue. Attempting to adjust ERA for the quality of the defense is a good idea, but the simple use of errors is a very flawed way to do this, in the same way that the use of errors is a very flawed way to evaluate defense.

So, how does this affect our evaluation of pitchers? Well, the comparison that sent me off on this kick is that of Brandon Webb versus Jake Peavy. They've had extremely similar careers to this point: Webb has 1089 IP with 390 ER, for a 3.22 ERA; Peavy has 1087 IP with 400 ER, for a 3.31 ERA. So even before adjusting for ballpark, Webb appears to be the superior pitcher. Once you adjust for park, Webb ends up with a 144 ERA+ and Peavy with a 119 ERA+. It's apparently not even close. But wait, Webb is an extreme groundball pitcher, and as such gives up many more unearned runs than Peavy. That's not a function of Webb's defense; it is a result of the inherent pitching ability of Brandon Webb and the way he attacks hitters. Regardless of how good his defense is, he's going to allow more unearned runs than Jake Peavy over the long haul. As such, the unearned runs must be included in any analysis of his pitching. So if you include those, suddenly Webb has allowed 456 total runs, versus 427 total runs allowed for Peavy. After adjusting for park, Webb is still better, but it's certainly closer than it appears from a cursory glance at ERA or park-adjusted ERA.
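
To put numbers on the Webb and Peavy comparison, here is a minimal Python sketch that computes ERA and RA per nine innings from the career totals quoted above. The function and variable names are mine, innings are treated as whole numbers as they are written in this post, and nothing here is park adjusted.

```python
def per_nine(runs, innings):
    """Runs allowed per nine innings pitched."""
    return 9 * runs / innings


# Career totals as quoted in the post (innings, earned runs, total runs).
pitchers = {
    "Webb": {"ip": 1089, "er": 390, "r": 456},
    "Peavy": {"ip": 1087, "er": 400, "r": 427},
}


def summarize(line):
    """Return ERA, RA, and the unearned run count for one pitcher."""
    return {
        "era": round(per_nine(line["er"], line["ip"]), 2),
        "ra": round(per_nine(line["r"], line["ip"]), 2),
        "unearned": line["r"] - line["er"],
    }


for name, line in pitchers.items():
    print(name, summarize(line))
```

On raw ERA Webb leads, 3.22 to 3.31. On raw RA the order flips, 3.77 for Webb against 3.54 for Peavy, because Webb has 66 unearned runs to Peavy's 27. The park adjustment then has to make up that gap before Webb comes out ahead.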

Use of unearned runs in a pitching analysis makes a big difference in looking at the NL Cy Young race last year. Myron at Friar Forecast took a look at the value of Peavy and Webb last year and concluded that if we ignore unearned runs, Webb was actually slightly more valuable than Peavy last year (the extra innings and more difficult ballpark to pitch in outweigh Peavy's more impressive raw ERA). However, if you re-run the analysis as Myron does in the comments section (prompted by me, actually) to account for the unearned runs, it tilts the scales definitively in Peavy's favor. Campaigning for Peavy in this context was actually what set me off on my anti-earned/unearned runs crusade.

There are some other interesting comparisons to look at through the lens of earned vs. total runs allowed. Last year Greg Maddux put up a 4.14 ERA in the cavernous Petco Park, while Derek Lowe compiled a 3.88 ERA in the neutral Dodger Stadium (yes, Dodger Stadium is basically neutral, perhaps even slightly favoring hitters). They pitched 198 and 199 1/3 innings, respectively. Lowe threw slightly more innings with a lower ERA in a more hitter-friendly park; it seems like a slam dunk that he would have been more valuable. So why does VORP (a measure of a pitcher's value compared to a generic freely available replacement, which adjusts for park and league and, yes, uses total runs instead of only earned ones) say Maddux was worth 5 runs more last year than Lowe? I'll give you a hint: it's the subject of this entire post, and the last comment in my parenthetical explanation of VORP was something of a spoiler. Yes, that's right, Maddux allowed only a single unearned run last year, while Lowe allowed a whopping fourteen. It's funny how the earned/unearned run split colors our perception. Lowe is viewed as a very good number three pitcher, while Maddux is considered more of a number four guy and considerably shakier. And yet, Maddux was better than Lowe last year.
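
A rough Python sketch of the Maddux and Lowe arithmetic, using only the ERAs, innings, and unearned run counts quoted above. Earned runs are not stated in this post, so the code estimates them by rounding ERA times innings over nine; treat that step as an assumption. The helper name is mine, and this is raw RA, not a reproduction of VORP, which also adjusts for park and league.

```python
def ra_from_era(era, innings, unearned_runs):
    """Estimate RA per nine from a published ERA plus unearned runs.

    Earned runs are backed out of the rounded ERA, so they are an
    estimate rather than an official total.
    """
    earned = round(era * innings / 9)
    total_runs = earned + unearned_runs
    return 9 * total_runs / innings


maddux_ra = ra_from_era(4.14, 198, 1)
lowe_ra = ra_from_era(3.88, 199 + 1 / 3, 14)

print(f"Maddux RA: {maddux_ra:.2f}")
print(f"Lowe RA:   {lowe_ra:.2f}")
```

Under that estimate Maddux comes in around 4.18 and Lowe around 4.52, so Lowe's lead in ERA turns into a deficit in RA before any park adjustment is applied.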

One final comparison, just for fun. Matsuzaka had a 4.40 ERA last year in a season considered mostly a disappointment. However, he didn't allow a single unearned run. Oliver Perez posted an excellent 3.56 ERA, but he allowed twenty (!) unearned runs. That is definitely an indication the Mets defense was shaky behind him, but Perez certainly bears some portion of the blame for those runs. If 20 of Matsuzaka's runs were converted to unearned runs, perhaps by moving 8 errors committed by the Red Sox into 8 of Matsuzaka's bad innings, suddenly he's got a 3.52 ERA and is celebrated as a huge success story. If all of Oliver Perez's unearned runs were earned (maybe the Mets' scorekeeper doesn't like to award errors, ever), then he's got a questionable 4.58 ERA. Suddenly the Pirates organization doesn't look like quite such an epic failure for trading Perez away for peanuts. Sorry, Pirates, most of your other epic failures can't be explained away by looking at unearned runs. The point of this hypothetical is that small changes that have little to do with how Matsuzaka or Perez pitched would cause massive changes in their ERAs and in how they're perceived.
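
The size of those swings follows from simple arithmetic: reclassifying k runs over IP innings moves ERA by 9 * k / IP. The Python sketch below uses that relation in both directions. Innings totals for Matsuzaka and Perez are not given in this post, so the code backs them out of the rounded ERAs quoted above, which makes them approximate; the function names are mine.

```python
def era_shift(runs_moved, innings):
    """Change in ERA when runs_moved runs switch between earned and unearned."""
    return 9 * runs_moved / innings


def implied_innings(era_before, era_after, runs_moved):
    """Innings implied by an ERA change; approximate, since ERAs are rounded."""
    return 9 * runs_moved / abs(era_before - era_after)


# ERAs as quoted in the post, each hypothetical moving 20 runs.
matsuzaka_ip = implied_innings(4.40, 3.52, 20)
perez_ip = implied_innings(3.56, 4.58, 20)

print(f"Matsuzaka: about {matsuzaka_ip:.0f} IP, "
      f"shift {era_shift(20, matsuzaka_ip):.2f}")
print(f"Perez:     about {perez_ip:.0f} IP, "
      f"shift {era_shift(20, perez_ip):.2f}")
```

For a starter with roughly 200 innings, every reclassified run is worth close to 0.045 of ERA, so a handful of scoring decisions can move a pitcher from "disappointment" to "success story" without a single pitch being thrown differently.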

The moral of this story: errors are arbitrary, and bad defense happens to everyone, regardless of their ERA/RA split (RA is ERA but with all runs allowed included). When evaluating past performance of pitchers, RA is a better tool than ERA.

As a postscript, this is an interesting article by David Gassko looking at ground ball pitchers. He confirms that groundball pitchers do allow more unearned runs than other pitchers: 85% of errors occur on groundballs. This article by Michael Wolverton makes the same case I'm trying to make here, only using actual numbers to back it up: that preventing unearned runs is a skill just like preventing earned runs. Wolverton summarizes it succinctly: "Errors will happen. Good pitchers will minimize the damage caused by them. That is, a good pitcher will allow fewer runners on base before the errors happen (so there aren't runners to score on the errors), and will allow fewer hits and walks after errors happen (so the runners who reached on errors won't score)." He finds that, yes, pitchers good at preventing earned runs are also good at preventing unearned runs in general. If he had adjusted for groundball rate, presumably this correlation would have been even stronger.

2 comments:

Anonymous said...

Is this Ben B. (I assume it is from reading the above post : )? You should have mentioned that you started a blog. Anyway, I'll dedicate a whole post to this blog at some point on mine, as I'm thrilled to be able to read more of your thoughts (seriously, I am ...) and I'm sure others would be too.

That's unless you want to stay hidden in internet obscurity like I occasionally do. Wait, even if that's the case, it will not matter whether I link to you or not : )

Anyway, I agree with your point here. I still probably use ERA too much due to general laziness, but I'm going to try to go with RA more often. There's really no reason not to. I mean, an error is usually supposed to be an out; then again, a blooper that falls in may be an out, say, 75% of the time (but if so and so lets it drop, it won't be scored an error, like you mention).

Anyway, great stuff.

Anonymous said...

Hey, Ben. This is cool.

I have a question. Does looking at WHIP help? Or BABIP against?