Tuesday, November 27, 2007

Player Value, Part 5a: Pitchers

To view the rest of the player value series, click on the player value label at the bottom of this post.

Now that we (or, at least, I) have decided on how we'll evaluate the value of position players, it's time to turn our attention to pitchers. The goal will be the same as with hitters--estimate a player's contributions to the team, reported in the currency of runs.

There's a sense in which this might be an easier thing to do with pitchers than with hitters. After all, one of the traditional ways in which we report pitcher performance--earned run average--is based on counting up the actual number of runs a pitcher allowed over the course of a season!

Absolute Runs Allowed Estimates

Let's start with the first step, which is getting an estimate for the total number of runs a pitcher allowed over the course of the season. Once we have that, we'll take a look at runs allowed vs. average and replacement. The easiest way to get the number of runs a pitcher allowed is to record the actual number of runs allotted to a pitcher based on our conventional scoring practices (below I'll refer to this as "TrueRuns"). However, I also want to think about two other alternatives.

Base Runs

Base Runs ("BsR") was introduced in the piece on Run Estimation for position players. With hitters, it's not appropriate to use the base runs equation to estimate runs created. Doing so assumes an interaction between an individual's ability to get on base and that same individual's ability to move runners around the bases. With pitchers, though, that assumption is entirely appropriate--a pitcher's ability to prevent baserunners directly interacts with his ability to prevent the advancement of those baserunners to determine how many runs are permitted to score.

Why would you want to use base runs instead of actual runs scored to assess player value? Well, if starters threw complete games every time out, I'm not sure that there would be a compelling reason. However, that's not what actually happens. Often, starting pitchers will leave a game in the middle of an inning with runners on base. Whether those runners score has to do with the performance of relievers. On a team with an outstanding bullpen, a large number of those inherited runners might not score. But in that case, I think we're overestimating the value of that starter because of the outstanding performance of the bullpen.

Similarly, the performance of bullpen pitchers may not be properly assessed by using straight runs scored. A reliever can come into a game with the bases loaded and two outs and allow three base runners before ending the inning, and yet still not be charged with any runs allowed--they all go to the starter. Clearly that's overestimating how much value the reliever is bringing to the team.

Base Runs allows us to get around those issues by assessing the typical value, in terms of runs allowed, of each event that happens while a pitcher is on the mound. Base Runs still retains the nonlinear way in which singles, doubles, walks, etc, interact to produce runs as they become more frequent. It certainly misses out on some of the context of when exactly those events happen with respect to one another. But given its accuracy across a variety of situations and conditions, and the ability to focus exclusively on what happens while a pitcher is on the mound (as opposed to the actions of other pitchers in the same game), it may be preferable to use BsR--especially when dealing with small sample sizes or exceptionally strong/weak bullpens.

I'll present two alternative equations. The first, which is the one I'm using in this article, uses data provided by Baseball Reference. These data are really awesome because for each pitcher, they include all the various statistics typically reported for hitters--singles, doubles, triples, stolen bases, etc. This allows me to use essentially the same equation for pitchers that I used for hitters, though in this case I'm also including Reached On Errors (ROE) in my equation. As with the hitters, I've forced this equation to predict runs scored in the '04-'07 National League. Here's "my" custom base runs equation:

BsR = A*(B/(B+C)) + D
where:
A = H - HR + NIBB + IBB + ROE + HBP + 0.08*SH
B = .829*1B + 2.224*2B + 3.578*3B + 1.872*HR + .059*NIBB + .912*ROE + .928*SB - 1.356*CS + 0.186*HBP - 0.551*IBB + .830*SH - 1.356*GDP - 0.005*nonKOuts - 0.065*K
C = 0.92*SH + nonKOuts + K
D = HR
and:
Outs = AB - H + SF + 0.92*SH
nonKOuts = Outs - K - 0.92*SH

(note: SH has to be removed from the nonKouts term because it is handled separately in parts B & C, but I like to include it in my outs term when estimating r/g later on. This is something I should have done with hitters, and I'll revise those equations shortly--I'm sure it's a very minor adjustment).
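For anyone who wants to plug numbers into this, here is a minimal Python sketch of the equation above. The function name, the dictionary keys, and the stat line in `sample` are all made up for illustration (it is not a real pitcher), and I'm assuming the inputs are the opponent batting totals charged to a pitcher, as in the Baseball Reference data.

```python
def custom_base_runs(s):
    """Estimate runs allowed from a dict of opponent batting stats.

    Keys are hypothetical names chosen for this sketch; the coefficients
    are copied directly from the equation in the post.
    """
    # Outs as defined in the post, then strip out strikeouts and sacrifice hits.
    outs = s["AB"] - s["H"] + s["SF"] + 0.92 * s["SH"]
    non_k_outs = outs - s["K"] - 0.92 * s["SH"]

    # A: baserunners (home runs are handled separately in D).
    a = (s["H"] - s["HR"] + s["NIBB"] + s["IBB"] + s["ROE"] + s["HBP"]
         + 0.08 * s["SH"])
    # B: advancement factor.
    b = (0.829 * s["1B"] + 2.224 * s["2B"] + 3.578 * s["3B"] + 1.872 * s["HR"]
         + 0.059 * s["NIBB"] + 0.912 * s["ROE"] + 0.928 * s["SB"]
         - 1.356 * s["CS"] + 0.186 * s["HBP"] - 0.551 * s["IBB"]
         + 0.830 * s["SH"] - 1.356 * s["GDP"]
         - 0.005 * non_k_outs - 0.065 * s["K"])
    # C: outs. D: home runs, which always score.
    c = 0.92 * s["SH"] + non_k_outs + s["K"]
    d = s["HR"]
    return a * (b / (b + c)) + d


# A made-up stat line, purely to show the call.
sample = {
    "AB": 800, "H": 200, "1B": 130, "2B": 40, "3B": 5, "HR": 25,
    "NIBB": 50, "IBB": 5, "ROE": 8, "HBP": 6, "SH": 10, "SF": 5,
    "SB": 10, "CS": 5, "GDP": 20, "K": 180,
}
print(round(custom_base_runs(sample), 1))
```

The A*(B/(B+C)) + D shape is the whole point: baserunners times the share of them that score, plus home runs.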


Note that if you only have traditional pitching statistics, there is an alternative version of the base runs equation designed for those stats, originally devised by David Smyth. Here's a slight variation on that equation, with the B term "fudged" slightly to perfectly match MLB '03-'07 totals.

BsR = A*(B/(B+C)) + D
where:
A = H + BB - HR + HBP - IBB
B = -0.625*H + 0.104*BB - 3.123*HR + 1.457*eTB + 0.1*HBP - 0.1*IBB
C = IP*3
D = HR
and:
eTB = 1.12*H + 4*HR

This is the equation I used in my profile on Francisco Cordero because it only uses widely available pitching statistics, which are the most convenient stats to pull from most online player profile pages.
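As a rough illustration, the traditional-stats version can be sketched in a few lines of Python. The function name and the example inputs are hypothetical, and I'm assuming IP is entered as decimal innings (e.g. 205.0, or 231.67 for 231 and two-thirds).

```python
def simple_base_runs(h, bb, hr, hbp, ibb, ip):
    """Base Runs from traditional pitching stats, per the equation above."""
    etb = 1.12 * h + 4 * hr                      # estimated total bases
    a = h + bb - hr + hbp - ibb                  # baserunners
    b = (-0.625 * h + 0.104 * bb - 3.123 * hr
         + 1.457 * etb + 0.1 * hbp - 0.1 * ibb)  # advancement factor
    c = ip * 3                                   # outs
    d = hr                                       # home runs always score
    return a * (b / (b + c)) + d


# Made-up season line, only to show the call.
print(round(simple_base_runs(h=200, bb=55, hr=25, hbp=6, ibb=5, ip=205.0), 1))
```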

FIP Runs

Another alternative way to estimate runs allowed is to make use of Tom Tango's Fielding Independent Pitching (FIP) statistic. FIP is calculated as FIP = (13*HR + 3*(BB+HBP) - 2*K)/IP + X, where X is usually a constant to force league average FIP to equal league average ERA. However, if we're interested in estimating total runs allowed, rather than just earned runs allowed, we can instead force the equation to match league average runs/9 innings (RA). In 2007 NL, X = 3.54. We can then convert FIP, which is a runs/9 estimate, to a season-total runs estimate like this:

FIPRuns = FIP / 9 * IP
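Here is a small Python sketch of those two steps. The function names and the example stat line are made up; the default constant of 3.54 is the 2007 NL value quoted above, and you would need to refit it for any other league or season.

```python
def fip(hr, bb, hbp, k, ip, constant=3.54):
    """FIP scaled to runs per nine innings (constant from the 2007 NL)."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def fip_runs(hr, bb, hbp, k, ip, constant=3.54):
    """Convert the per-nine FIP rate into a season total of runs."""
    return fip(hr, bb, hbp, k, ip, constant) / 9 * ip


# Hypothetical pitcher: 25 HR, 55 BB, 6 HBP, 180 K in 205 innings.
print(round(fip(25, 55, 6, 180, 205.0), 2))
print(round(fip_runs(25, 55, 6, 180, 205.0), 1))
```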

Classically, FIP tries to estimate ERA using statistics that are under the exclusive control of the pitcher (i.e. not influenced by fielders)--walk rate, strikeout rate, and home run allowed rate. The literature on defense-independent pitching statistics (DIPS) indicates that these statistics are much more repeatable from year to year than hit/ball in play rate. Therefore, when ERA deviates from FIP, this is usually (though not always) the result of either "luck" or the fielding behind the pitcher.

Now, from the perspective of player value, there's a sense in which a DIPSish estimate like this isn't particularly informative. In the case of a pitcher with an unusually high BABIP due to poor fielding or bad luck, even if the extra hits aren't necessarily his fault, they did happen and therefore reduce how much value that player "contributed" to his team. However, the reason we're interested in value in the first place is, usually, to get some idea of how well players performed. Therefore, I think there's worth in using a stat like this as a complement to (not a replacement for) our other estimates of value.

Ok, all that said, let's take a look at how these three measures of absolute runs allowed compare using the '07 Reds as a case study. Note that I have applied a park factor adjustment on these numbers. One could argue whether that's appropriate for FIP, but I went ahead and did it under the assumption that a HR park factor is a big part of the typical park factor:

Update (1/18/08): Recently, I have taken to calculating FIPRuns based on a HR-Park factor adjusted HR total for players or teams. I then do not adjust by overall runs park factor. The effect turns out to be largely the same. Given that k/9 and bb/9 are somewhat dependent on runs environment (more runs = more PA's per nine innings), I'm also not convinced it's necessarily a better option. But it somehow feels like the right thing to do.

Pitcher IP TrueRuns BsR FIPRuns
MBelisle 177.7 109.9 105.3 94.4
BArroyo 210.7 107.9 117.4 111.7
AHarang 231.7 99.0 96.4 101.0
KLohse 131.7 75.2 71.1 68.7
MStanton 57.7 38.6 37.4 29.5
KSaarloos 42.7 35.6 33.5 29.1
TCoffey 51.0 35.6 39.3 35.1
BLivingston 56.3 34.7 36.8 30.1
DWeathers 77.7 32.7 33.1 35.6
HBailey 45.3 31.7 26.2 25.7
PDumatrait 18.0 29.7 29.2 17.7
VSantos 49.0 27.7 30.8 31.6
GMajewski 23.0 21.8 20.3 12.6
JCoutlangus 41.0 21.8 21.1 21.9
MGosling 33.0 21.8 26.3 22.3
EMilton 31.3 20.8 21.9 16.8
TShearn 32.7 17.8 19.3 24.7
JBurton 43.0 14.9 10.7 19.4
ERamirez 16.3 13.9 12.4 14.2
MMcBeth 19.7 12.9 12.8 9.3
BSalmon 24.0 10.9 10.1 12.3
EGuardado 13.7 10.9 9.0 8.0
BBray 14.3 9.9 7.8 5.5
RStone 5.3 5.9 5.8 7.4
RCormier 3.0 3.0 3.0 3.0

I have two primary observations from this table. First, each of these estimates tells us pretty similar things about the pitchers. That's good, because they're all estimates of the same thing. True Runs and BsR, in particular, are pretty darn close for most pitchers, with a correlation of 0.995. Bronson Arroyo, for whatever reason, shows the biggest difference, with BsR showing him costing the Reds ~10 more runs than he "actually" did. Given how close they are, my tendency is to just use BsR--it avoids confounds with the performances of other pitchers, and has the advantage of following a very similar methodology to the one we use for hitters.
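If you want to check the TrueRuns/BsR relationship yourself, here's a quick Python sketch that computes a Pearson correlation from the rounded values in the table. The variable and function names are mine, and because the table is rounded to one decimal, the result may differ slightly from the figure quoted above.

```python
from math import sqrt

# (TrueRuns, BsR) pairs copied from the table, in row order.
pairs = [
    (109.9, 105.3), (107.9, 117.4), (99.0, 96.4), (75.2, 71.1), (38.6, 37.4),
    (35.6, 33.5), (35.6, 39.3), (34.7, 36.8), (32.7, 33.1), (31.7, 26.2),
    (29.7, 29.2), (27.7, 30.8), (21.8, 20.3), (21.8, 21.1), (21.8, 26.3),
    (20.8, 21.9), (17.8, 19.3), (14.9, 10.7), (13.9, 12.4), (12.9, 12.8),
    (10.9, 10.1), (10.9, 9.0), (9.9, 7.8), (5.9, 5.8), (3.0, 3.0),
]


def pearson(xy):
    """Plain Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(xy)
    mean_x = sum(x for x, _ in xy) / n
    mean_y = sum(y for _, y in xy) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in xy)
    var_x = sum((x - mean_x) ** 2 for x, _ in xy)
    var_y = sum((y - mean_y) ** 2 for _, y in xy)
    return cov / sqrt(var_x * var_y)


print(round(pearson(pairs), 3))
```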

My second observation is that these values give us very little indication of player value. The finding that Belisle, Arroyo, Harang, and Lohse allowed more runs than anyone else has everything to do with the fact that they got more innings than anyone else on the staff! Fortunately, there are three simple stats that we can calculate from these runs totals to get a better handle on player value: runs per game, runs above average, and runs above replacement.

Baselines for Pitching Value

Runs Per Game

RPG = Runs/IP*9
or
RPG = Runs/Outs*26.25

For true runs and FIP, I'm using IP. For BsR, I'm using Outs. But it's basically just doing the same thing.
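In code, the two versions are one-liners. This is just a sketch with function names I made up; the example uses Harang's TrueRuns and IP from the table above.

```python
def rpg_from_ip(runs, ip):
    """Runs per game from innings pitched (used here for TrueRuns and FIPRuns)."""
    return runs / ip * 9


def rpg_from_outs(runs, outs):
    """Runs per game from outs (used here for BsR), with 26.25 outs per game."""
    return runs / outs * 26.25


# Harang's row from the table: 99.0 TrueRuns in 231.7 IP.
print(round(rpg_from_ip(99.0, 231.7), 2))
```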

Runs Above Average

RAA = (RPG - lgRPG)/9*IP * -1
or
RAA = (RPG - lgRPG)/26.25*Outs * -1

Pretty straightforward, right? Just subtract league average runs per game from the player's runs per game, then extend it out to the full number of innings or outs (depending on how you're calculating runs per game) that he pitched in a season. I multiply by -1 to convert this from runs allowed above average to runs saved above average, because I find that this makes it more straightforward to interpret (positive numbers are good).

Update (1/18/08): I currently include an adjustment for lgRPG based on expected differences in relievers and starters, similar to what I do for RAR (see below). In this case, the assumption is that the average pitcher will allow runs at 89.5% of league average as a reliever, but 110.5% of league average as a starter.
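A minimal Python sketch of the IP-based RAA calculation follows. The `role_factor` argument is my own way of expressing the starter/reliever adjustment from the update (1.105 for starters, 0.895 for relievers, 1.0 for no adjustment), and the league average of 4.7 runs per game in the example is a made-up placeholder, not a real league figure.

```python
def raa(rpg, lg_rpg, ip, role_factor=1.0):
    """Runs saved above average; positive numbers are good.

    role_factor scales the league baseline: 1.105 for starters and 0.895
    for relievers per the 1/18/08 update, or 1.0 for no adjustment.
    """
    return -(rpg - role_factor * lg_rpg) / 9 * ip


# Hypothetical pitcher: 4.0 runs per game over 180 IP, league at 4.7 (made up).
print(round(raa(4.0, 4.7, 180.0), 1))          # unadjusted
print(round(raa(4.0, 4.7, 180.0, 1.105), 1))   # judged against a starter baseline
```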

Runs Above Replacement

RAR = (RPG - Y*lgRPG)/9*IP * -1
or
RAR = (RPG - Y*lgRPG)/26.25*Outs * -1
(For starting pitchers, Y=1.28. For relief pitchers, Y=1.07)

This is the same as the RAA equation, except for the Y coefficient. That's where things get a bit interesting.

The issue is that there are two different roles for pitchers: starting and relieving. In general, pitchers perform much more poorly as starters than they do as relievers (see pp.201-207 in The Book for a nice study on this, or this thread and this thread for some additional estimates and arguments). Therefore, we need to use different baselines depending on whether a pitcher is being used as a starter or a relief pitcher. Unfortunately, probably even more so than for hitters, there is not a clear consensus on what numbers we should use to do this. For example, as far as I can tell, Tom Tango, MGL, and Patriot all use slightly different values for starter and reliever replacement level. At this point, until I do my own study on this, I'm just going to pick Tom Tango's numbers because it seems like he's done a lot of thinking/analysis on this issue. That's not to say that I'm sure his numbers are correct, or that the other numbers are wrong, but his numbers make sense and seem to be consistent with empirical data.

Anyway, Tango argues that a replacement pitcher, used as a starter, will produce a 0.380 winning percentage, which means he will allow runs at 128% of league average. In contrast, a replacement reliever will be good for a 0.470 winning percentage, and will allow runs at 107% of league average. The latter number may surprise folks, because it means that replacement level for relievers is very close to league average! Not good news for the Reds' bullpen, or for evaluations of the effectiveness of the Reds' front office.

There are pitchers, of course, who serve as both starters and relievers over the course of a season. Ideally, we'd treat a pitcher's relief outings separately from his starting outings and then sum the RAR together, but that's not possible to do if you're working from a single row of data per player like I often am. Therefore, I'm going to borrow from Patriot's approach and categorize pitchers this way: starting pitchers are defined as those who made at least 50% of their appearances as starters, or who started at least 15 games in a season. Relievers are everybody else. It's not perfect, but it'll get us pretty close to the mark.
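Putting the role rule and the RAR equation together, a rough Python sketch might look like this. Function names are hypothetical, the Y values are the 1.28 and 1.07 given above, and the league average of 4.7 runs per game in the example is again a made-up placeholder.

```python
def is_starter(games, games_started):
    """Starter if at least 15 starts, or at least 50% of appearances were starts."""
    if games <= 0:
        return False
    return games_started >= 15 or games_started / games >= 0.5


def rar(rpg, lg_rpg, ip, starter):
    """Runs saved above replacement, IP-based; positive numbers are good."""
    y = 1.28 if starter else 1.07   # replacement level relative to league average
    return -(rpg - y * lg_rpg) / 9 * ip


# Hypothetical pitchers, with a made-up league average of 4.7 runs per game.
print(round(rar(4.0, 4.7, 180.0, is_starter(games=32, games_started=32)), 1))
print(round(rar(4.0, 4.7, 60.0, is_starter(games=60, games_started=0)), 1))
```

Notice that the same 4.0 runs per game is worth far more per inning from a starter than from a reliever, which is exactly the effect of the two baselines.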

One last point: thinking about it now, a starter/reliever adjustment should probably be done to the RAA calculations too. But given that I generally prefer RAR to RAA, I'm going to ignore that for now...someone can fill in the blanks for me if they like. :)

2007 Cincinnati Reds

Below I'm reporting RAR values for the '07 Reds using Base Runs and FIP Runs. "True" runs above replacement is very similar to Base Runs (correl = 0.97), but as I said above, I'm partial to Base Runs because they are less confounded by the performance of other pitchers. Therefore, I'm opting not to report those values to reduce the clutter in this table.

Update (12/3/07): I discovered I accidentally was using the wrong replacement-level values in my spreadsheet. Starters got a bump upward, relievers got a bump downward. Oops!
Base Runs
Pitcher IP RAR
AHarang 231.7 61.4
BArroyo 210.7 25.9
KLohse 131.7 17.5
MBelisle 177.7 13.7
JBurton 43.0 13.5
DWeathers 77.7 11.7
HBailey 45.3 4.4
BSalmon 24.0 3.2
TShearn 32.7 2.2
JCoutlangus 41.0 1.9
BLivingston 56.3 1.6
BBray 14.3 0.4
EMilton 31.3 0.4
EGuardado 13.7 -1.0
MMcBeth 19.7 -1.4
RCormier 3.0 -1.4
ERamirez 16.3 -1.9
RStone 5.3 -2.8
VSantos 49.0 -3.4
MStanton 57.7 -5.2
GMajewski 23.0 -7.6
MGosling 33.0 -8.1
KSaarloos 42.7 -9.1
TCoffey 51.0 -10.4
PDumatrait 18.0 -17.1

FIP Runs
Pitcher IP RAR
AHarang 231.7 56.0
BArroyo 210.7 30.9
MBelisle 177.7 25.9
KLohse 131.7 20.4
DWeathers 77.7 8.3
BLivingston 56.3 8.0
JBurton 43.0 4.9
HBailey 45.3 4.9
EMilton 31.3 4.4
MStanton 57.7 3.1
BBray 14.3 2.6
MMcBeth 19.7 1.8
JCoutlangus 41.0 1.3
BSalmon 24.0 1.3
GMajewski 23.0 0.4
EGuardado 13.7 -0.3
RCormier 3.0 -1.3
TShearn 32.7 -2.6
ERamirez 16.3 -3.2
MGosling 33.0 -3.7
VSantos 49.0 -4.0
RStone 5.3 -4.4
KSaarloos 42.7 -5.0
PDumatrait 18.0 -5.6
TCoffey 51.0 -6.4

Brief Notes:
  • Aaron Harang was clearly the most valuable pitcher on the staff. Duh. However, his value estimate of 61 runs above replacement also puts him well over top-ranked position player Brandon Phillips, who I estimated at just shy of 40 runs above replacement. Phillips' outstanding PMR ratings will give him a bit of a boost when I update those numbers. Nevertheless, it will not be enough to catch Harang, the 2007 Reds MVP.
    • As an aside, a 5.5 WAR pitcher, which is what Harang could potentially be projected to be in the future based on this analysis, is worth roughly $24 million/year as a free agent according to Tom Tango's pay scale. Even if he "declines" to average "just" 4 WAR a season from here on out, that's still worth $20 million/year. That 4-year, $37 million extension prior to this season is looking pretty darn good, eh? One of the moves that Krivsky really got right.
  • The player getting the biggest boost in the FIP Runs column is Matt Belisle, who goes from scrub to respectable. Belisle's peripherals weren't terribly different from those of Bronson Arroyo, but his BABIP was a tad high at 0.326, and his FIP (4.54) looks a lot better than his actual ERA (5.32). If he can post a mid-4's ERA next season it would go a long way toward solidifying the Reds' rotation.
  • Falling the other direction was Jared Burton. Jared had a fine first season, especially given that he was making the jump from AA to MLB this year. Nevertheless, his walk rate (4.9 bb/g) was unacceptably high, and will have to improve if he's going to continue to be successful out of the pen. Fortunately, at least as a trend, his control was much improved during the last month or two of the season, giving hope that he can really be a force next season.
  • What on earth happened to Todd Coffey?
Update (12/3/07): The reliever values reported above should be considered preliminary. As detailed in the comments below, I neglected to consider leverage when assessing reliever value. I'm working on an adjustment to correct this problem. Basically, David Weathers gets a nice boost. :)

The next in the player value series is a piece on runs environments, including park factors and custom team linear weights. That might get delayed for a bit though--I'm writing for the Hardball Times Season Preview again this season, and that's going to occupy a lot of my time over the coming weeks. Should be fun! BTW, if you haven't already, go here and order both the Hardball Times Annual and the Season Preview together and get a 10% discount (use code HTC08)! :)
Aaron Harang photo by Charles Rex Arbogast