Saturday, October 27, 2007

Player Value, Part 3a: Fielding Performance Estimators

To view the complete player value series, click on the player value label on any of these posts.

In the first article of this series, we found that a run prevented on defense is worth just as much as a run scored on offense. Nevertheless, for many fans, analysts on TV, and apparently even some within baseball, position player evaluation starts and ends with offense--if anything, consideration of fielding value is used as a tiebreaker. Fortunately, we now have access to a wide variety of quality statistics that can estimate fielding performance with reasonable accuracy. In this and the next few articles, I will run down some of these options, compare them to see how well they agree, make recommendations on how to use them, discuss how to do position adjustments to put fielders on even footing, and finally, take a look at the 2007 Cincinnati Reds!

So let's get started.

Fielding Percentage and Range Factor-Type Statistics

Virtually all fielding statistics operate on this simple equation:
Fielding = Outs/Opportunities
The primary differences between them have to do with how they measure outs and (especially) chances. Some also factor in a comparison of a player's out-rate to league average at that position to convert the rate into a +/- number, but it's still built upon an assessment of that rate.

The first and most basic (and flawed) effort to measure defense was fielding percentage, which was calculated as:
FldPct=Outs/Chances = [Assists+Putouts]/[Assists+Putouts+Errors]
This statistic traces back into the early 1900's, when error rates were high enough that they actually provided a pretty good estimate of fielding. And it's still the most commonly cited fielding statistic outside of stat circles--you'll often hear TV commentators, for example, say that the Rockies led all of baseball this season (and, in fact, all of history) in "fielding," meaning they had the highest team fielding percentage in history.

Errors certainly are bad things. However, the problem with using fielding percentage as one's primary fielding statistic is that error rate only evaluates part of fielding performance: making an out once you get to a ball. It completely ignores the ability of fielders to actually get to the ball in the first place, which, our modern fielding statistics tell us, is where the biggest differences among players actually exist.

In an early attempt to rectify this problem, Bill James developed range factor, which is calculated as simply:
RF = Outs/Game = (Assists + Putouts)/G
Makes sense, right? Now we're trying to assess the rate at which players actually make outs, not just the rate at which they make mistakes.
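To make the two formulas concrete, here is a minimal Python sketch. The function names and the sample season line are my own illustrative assumptions, not data from any real player.

```python
def fielding_percentage(putouts: int, assists: int, errors: int) -> float:
    """FldPct = (Assists + Putouts) / (Assists + Putouts + Errors)."""
    chances = putouts + assists + errors
    return (putouts + assists) / chances


def range_factor(putouts: int, assists: int, games: int) -> float:
    """RF = (Assists + Putouts) / G."""
    return (putouts + assists) / games


# Hypothetical shortstop season (made-up numbers, for illustration only).
po, a, e, g = 220, 430, 15, 150

fld_pct = fielding_percentage(po, a, e)
rf = range_factor(po, a, g)
print(f"FldPct = {fld_pct:.3f}, RF = {rf:.2f}")
```

Note that errors only show up in the first function; range factor does not care how the non-outs happened, only how many outs were recorded per game.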

Unfortunately, there is a significant problem with range factor. What James was trying to approximate with range factor was Outs/Opportunities. Because he was limited by data at the time, he had to make the assumption that Opportunities/G was more or less constant, at least within a position. If true, then range factor (outs/G) would tell you the same thing as Outs/Opportunities.

To evaluate the extent to which this is true, here are some data extracted from this article by MGL, showing 2002 starting shortstops (Games and Outs come from traditional scorebook stats, whereas Opportunities were extracted from MGL's play-by-play data and represent the number of balls hit toward a player, whether he converted them into outs, made an error, or let them go through for a hit):
Name Team G Outs RF Opps Opps/G Outs/Opps
Fox Fla 112 500 4.5 265 2.4 1.9
Izturis LA 128 535 4.2 290 2.3 1.8
Clayton CWS 109 509 4.7 278 2.6 1.8
Gomez TB 130 616 4.7 345 2.7 1.8
Aurilia SF 131 571 4.4 327 2.5 1.7
Rodriguez Tex 162 766 4.7 443 2.7 1.7
Guzman Min 147 637 4.3 378 2.6 1.7
Bordick Bal 117 593 5.1 354 3.0 1.7
Perez KC 139 674 4.9 405 2.9 1.7
Cruz SD 147 617 4.2 380 2.6 1.6
Furcal Atl 150 731 4.9 451 3.0 1.6
Guillen Sea 130 530 4.1 334 2.6 1.6
Ordonez NYM 142 655 4.6 416 2.9 1.6
Gonzalez ChC 142 599 4.2 382 2.7 1.6
Hernandez Mil 149 735 4.9 469 3.1 1.6
Uribe Col 155 812 5.2 521 3.4 1.6
Eckstein Ana 147 626 4.3 406 2.8 1.5
Vizquel Cle 150 701 4.7 461 3.1 1.5
Garciaparra Bos 154 708 4.6 481 3.1 1.5
Larkin Cin 135 624 4.6 425 3.1 1.5
Cabrera Mon 153 748 4.9 519 3.4 1.4
Jeter NYY 156 594 3.8 415 2.7 1.4
Rollins Phi 152 696 4.6 488 3.2 1.4
Wilson Pit 143 709 5.0 499 3.5 1.4
Renteria StL 149 638 4.3 449 3.0 1.4
Womack Ari 149 568 3.8 422 2.8 1.3
Tejada Oak 156 724 4.6 539 3.5 1.3

From the above table, you can see that James' assumption does not hold. There is substantial variation in these players' average number of Opportunities/G, which all but masks variation in the rate of Outs/Opportunities in the range factor calculations. In fact, range factor (outs/G) correlates far more strongly with Opportunities/G (r = 0.60 in this dataset) than it does with Outs/Opportunities (r = 0.10)! Therefore, range factor tells you almost nothing about how many outs a player makes given his opportunities, and mostly tells you how many balls are hit in his direction.
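If you want to check those correlations yourself, here is a rough Python sketch. It recomputes the three rates from the G, Outs, and Opps columns of the table above (rather than from the rounded rate columns) and uses a plain Pearson correlation. The helper name `pearson` and the variable names are my own; the results should land near the r = 0.60 and r = 0.10 quoted above, give or take rounding.

```python
from math import sqrt

# (name, games, outs, opportunities), copied from the table above.
SHORTSTOPS = [
    ("Fox", 112, 500, 265), ("Izturis", 128, 535, 290),
    ("Clayton", 109, 509, 278), ("Gomez", 130, 616, 345),
    ("Aurilia", 131, 571, 327), ("Rodriguez", 162, 766, 443),
    ("Guzman", 147, 637, 378), ("Bordick", 117, 593, 354),
    ("Perez", 139, 674, 405), ("Cruz", 147, 617, 380),
    ("Furcal", 150, 731, 451), ("Guillen", 130, 530, 334),
    ("Ordonez", 142, 655, 416), ("Gonzalez", 142, 599, 382),
    ("Hernandez", 149, 735, 469), ("Uribe", 155, 812, 521),
    ("Eckstein", 147, 626, 406), ("Vizquel", 150, 701, 461),
    ("Garciaparra", 154, 708, 481), ("Larkin", 135, 624, 425),
    ("Cabrera", 153, 748, 519), ("Jeter", 156, 594, 415),
    ("Rollins", 152, 696, 488), ("Wilson", 143, 709, 499),
    ("Renteria", 149, 638, 449), ("Womack", 149, 568, 422),
    ("Tejada", 156, 724, 539),
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


range_factors = [outs / g for _, g, outs, _ in SHORTSTOPS]
opps_per_game = [opps / g for _, g, _, opps in SHORTSTOPS]
outs_per_opp = [outs / opps for _, _, outs, opps in SHORTSTOPS]

r_opps = pearson(range_factors, opps_per_game)
r_rate = pearson(range_factors, outs_per_opp)
print(f"RF vs Opps/G:    r = {r_opps:.2f}")
print(f"RF vs Outs/Opps: r = {r_rate:.2f}")
```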

There have been subsequent attempts by James and others to try to better control for opportunities. Perhaps the best, and certainly the most widely circulated numbers of this kind are the Davenport fielding translations (Fielding Runs Above Average, FRAA) available at Baseball Prospectus, which report fielding prowess as runs saved above average. Unfortunately, to my knowledge, the actual methodology by which these "improved" numbers are generated has not been published anywhere, which means it's hard to know exactly what's going on under the hood (this is a chronic problem with BPro stats). Furthermore, because they are not based on hit-location data (like other stats below), they are not well-regarded among most baseball researchers.

My experience with BPro's FRAA numbers is that they're "pretty good." So I'm going to include them, for now, among the numbers that I'll compare in the next article.

Zone-Based Fielding Statistics

There are four variants of zone-based statistics that you will run across with any frequency: UZR, ZR, RZR, and BIS's plus/minus system. Let's run through them one at a time:

Ultimate Zone Rating (UZR)

UZR is often heralded as the gold standard of defensive stats. I agree that it's very good, but as we'll see in the comparisons article, I'm not convinced it's the only thing worth paying attention to. It is the creation of Mitchel Lichtman (aka MGL), and his methodologies are described in detail in these two posts at BBTF. In its most basic form, it's a pretty simple procedure, and it serves as a nice model to understand the other zone-based stats, so I'd like to walk through it.

Essentially, the ball field is broken up into different zones, which are the hit location zones defined by Project Scoresheet/Retrosheet (see figure to right). If one pays $10,000 to get hit location data from STATS Inc (which no individual fan except MGL has been willing to do), one can calculate how many balls were hit into each of these zones. And, for each position within each zone, one can determine the percentage of those balls that were typically converted into outs.

From the raw data, one can also measure how many balls were hit into each zone when a particular player was on the field at a given position, and how many of those balls were turned into outs by that player. And using that information, you can get the percentage of batted balls that the player converted into outs within that zone.

Let's say that the average shortstop converts 21% of balls hit into zone "56" into outs ("56" is the zone corresponding to the "hole" between third base and shortstop). And, let's say that Larry Barkin played the entire season as the Reds' starting shortstop, and he converted 25% of balls hit into zone 56 into outs. Based on that information, I think most folks would be comfortable saying that he was better than average, at least on balls hit into that zone.

But how much better? Well, 25%-21%=4%. But how much better is 4%? Well, let's say that there were 100 balls hit into zone 56 while Barkin was playing. If the average shortstop turns 21% of zone 56 balls into outs, that means the average shortstop would be expected to make 21 plays in Barkin's situation. And yet Barkin made 25 plays. Therefore, I'd say that Barkin performed 4 plays above average in zone 56. Now, let's say that we did the same procedure in all other zones on the ball field, and Barkin's rate matched the league-average rate exactly in those other zones. The summed difference between Barkin's rates of making outs and the average shortstop's rates of making outs would then be +4 plays, which would be entirely due to his excellence in zone 56.

Ok so far? What if we want to know the approximate run value of this better-than-average performance? Well, using linear weights, we can determine the average runs value of hits that go through zone 56 around the league. For the purposes of this illustration, let's say that every ball hit through zone 56 in baseball turned into a single (in reality, some turn into doubles or triples, but this isn't too far off). The marginal linear weights value of a single is ~0.460 runs, so Barkin's performance prevented ~0.46 x 4 = 1.84 runs from occurring via these singles. Furthermore, he also generated four additional outs, each of which is worth ~-0.265 marginal runs, so he prevented 0.265 x 4 = 1.06 runs by generating these four outs. This puts Barkin's total fielding value at shortstop, given that he was +4 plays above average, as 1.84+1.06 = +2.90 runs saved above average (~0.725 runs saved per play above average). This is the estimated improvement in defensive performance over what you would probably have gotten from a completely average shortstop had he played instead of Barkin.
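The Barkin example boils down to a few lines of arithmetic, sketched here in Python. This is only the simplified version described above, not MGL's actual UZR code: it assumes every ball through the zone would have been a single, and the second zone entry ("6") is a made-up placeholder included just to show that an average zone contributes nothing.

```python
SINGLE_RUN_VALUE = 0.46   # approximate marginal linear weights value of a single
OUT_RUN_VALUE = 0.265     # approximate marginal run cost of an out (absolute value)

# zone -> (balls hit into zone, outs the player made, league-average out rate)
# Zone "56" follows the example above; zone "6" is a hypothetical average zone.
barkin_zones = {
    "56": (100, 25, 0.21),
    "6": (200, 130, 0.65),
}


def plays_above_average(zones):
    """Sum, over zones, of actual outs minus the outs an average fielder would make."""
    return sum(outs - balls * lg_rate for balls, outs, lg_rate in zones.values())


def runs_saved(plays, hit_value=SINGLE_RUN_VALUE, out_value=OUT_RUN_VALUE):
    """Each extra play removes a hit and adds an out, so both values count."""
    return plays * (hit_value + out_value)


plays = plays_above_average(barkin_zones)
runs = runs_saved(plays)
print(f"{plays:+.1f} plays, {runs:+.2f} runs saved above average")
```

The per-play value here is just 0.46 + 0.265 = 0.725 runs, which is why the total scales linearly with plays above average.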

This, in a nutshell, is what UZR does. You'll note that the rates from which we're calculating the +/- plays above follow the same generalized formula that we presented at the top of the article: outs/opportunities. It's just that instead of having to assume that opportunities are only those balls that a player got to, or that the average number of opportunities per game was constant, the zone system allows us to get a much better estimate of the number of opportunities a player actually had.

In reality, UZR is a bit more complicated than I indicated here, using different rates to account for batter handedness (attempts to adjust for positioning), how hard a ball was hit (to account for difficulty), as well as adjustments for defensive park factors and a few other things (you can refer to MGL's two articles for more info). Nevertheless, this should give you a basic understanding of how it all works. And that should help you understand the remaining stats as well, as they're all pretty similar.

Zone Rating (ZR)

While MGL's treatment of the STATS Inc. hit location data is much better, STATS Inc. has, for a long time, released a rate statistic called Zone Rating (available from a variety of websites, including espn.com, cnnsi.com, etc) that does things in a fairly similar fashion to UZR. STATS divides the field up into a larger number of zones than UZR (see image to right), and then assigns any zone in which defenders at a given position, on average, convert more than 50% of balls into outs to be within that position's "zones of responsibility." Zones are not shared among positions, and some zones are not the responsibility of any player.

Next, for each player, STATS tallies up how many balls were hit into his zones of responsibility (balls in zone, or BIZ), and how many of them he converted into outs (PLAYS). They also tally up any plays a player made outside his zones of responsibility (out of zone, or OOZ). They then calculate zone rating as:
ZR = (PLAYS+OOZ)/(BIZ+OOZ)
Again, this is essentially the same outs/opportunities formula that can be traced back to Fielding Percentage. You'll note that it's sloppier than UZR, because all zones of responsibility are lumped into one grand zone, rather than using separate rates for each individual zone (this is ameliorated somewhat by using smaller initial zones, but not entirely). The treatment of plays made outside the zones of responsibility is also a bit sloppy--the OOZ term probably should not be in the denominator. But this is the number that we get from STATS. And, as you'll see in the comparisons section, it's pretty useful for ranking player fielding.

What if we wanted to convert the ZR rate to a +/- runs statistic like UZR? Well, without the actual BIZ data from Stats, all we can do is make estimates of opportunities. Fortunately, Chris Dial, who had access to some of the source data underlying ZR, developed a procedure that allows us to do this with reasonable accuracy. It essentially just uses average opportunities per inning at each position to estimate BIZ for each player (technically it's BIZ+OOZ, but we'll ignore OOZ for now), from which point you can back-calculate PLAYS for each player. After that, getting a +/- plays estimate is as simple as:

+/-Plays = PLAYS - BIZ*(lgPLAYS/lgBIZ)

Where the "lg" prefix just means league totals of the given statistic.

Dial also provided average runs/play estimates for each position (which I assume are based on actual data on typical hit types through each position), allowing us to convert the +/- plays stat into a +/- runs stat:
1B 0.798 runs/play
2B 0.754 r/p
3B 0.800 r/p
SS 0.753 r/p
LF 0.831 r/p
CF 0.842 r/p
RF 0.843 r/p
You'll note that these values are all slightly higher than the 0.725 r/p we defined up in the UZR section. That's because some (and the number varies predictably by position) of the hits through a position turn into something more than singles. This is especially a factor for outfielders and corner infielders.
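Here is a rough Python sketch of the ZR-to-runs conversion as I've described it: estimate BIZ from innings, back-calculate PLAYS from the ZR rate, compare to league average, and multiply by the runs/play figure. The function name, the opportunities-per-inning value, and the sample ZR numbers are all placeholders I made up for illustration; Dial's actual per-position opportunity rates are not reproduced here.

```python
# Dial's average runs/play by position, from the list above.
RUNS_PER_PLAY = {
    "1B": 0.798, "2B": 0.754, "3B": 0.800, "SS": 0.753,
    "LF": 0.831, "CF": 0.842, "RF": 0.843,
}


def zr_to_runs(zr, league_zr, innings, opps_per_inning, position):
    """Estimate +/- runs from a ZR rate (a sketch, ignoring OOZ as in the text).

    opps_per_inning is an assumed league-average opportunity rate for the
    position; the value passed below is hypothetical.
    """
    est_biz = innings * opps_per_inning          # estimated balls in zone
    est_plays = zr * est_biz                     # back-calculated plays made
    plus_minus_plays = est_plays - est_biz * league_zr
    return plus_minus_plays * RUNS_PER_PLAY[position]


# Hypothetical shortstop: .860 ZR in 1300 innings, league ZR of .840,
# and an assumed 0.35 opportunities per inning at short.
ss_runs = zr_to_runs(zr=0.860, league_zr=0.840, innings=1300,
                     opps_per_inning=0.35, position="SS")
print(f"{ss_runs:+.1f} runs")
```

Because the estimated BIZ appears in both terms, the whole thing reduces to (ZR - lgZR) x estimated opportunities x runs/play, so the opportunity estimate only scales the answer and never changes its sign.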

All in all, while Dial's conversions of ZR aren't as good as UZR, they correlate well with it, and they are ultimately based on the same raw data. So they're definitely worth using, especially if you don't have UZR available.

Update: I recently discovered that the Replacement Level Yankee Blog has posted 1987-2007 ZR translations using Dial's methodology. 2002-2007 data use actual chances, which is pretty exciting. Nice resource.

Revised Zone Rating (RZR)

John Dewan, who was a founder of STATS Inc. and was the one who created Zone Rating in the first place, has since left that company and founded Baseball Info Solutions (BIS), which is now one of STATS Inc.'s competitors. Naturally, they created their own version of ZR! While there are a few minor differences in how the numbers are tallied, the most important thing about RZR is that it is available from The Hardball Times...and they report not only the RZR rate statistic, but also the "raw" BIZ, PLAYS, and OOZ data! This allows us to construct our own, more accurate +/- runs statistic using a simple process that I outlined here.

Basically, you get league averages of PLAYS per BIZ at each position, and then apply that average rate to each player's BIZ to get an expected number of PLAYS made. Then, you simply subtract this expected value from a player's actual number of PLAYS. Out of Zone plays are handled the same way. So, the equation is:

[PLAYS - BIZ*(lgPLAYS/lgBIZ)] + [OOZ - BIZ*(lgOOZ/lgBIZ)] = +/- Plays

Once you have your +/- Plays number, you can apply Dial's runs/play figures (see above) to convert it to a +/- runs statistic.
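For anyone who wants to script this, here is a short Python sketch of the RZR calculation, with both the in-zone and out-of-zone terms based on BIZ as in the equation above. The player and league totals in the example are invented for illustration; the runs/play values are Dial's figures from earlier in the article.

```python
# Dial's average runs/play by position (from the ZR section).
RUNS_PER_PLAY = {
    "1B": 0.798, "2B": 0.754, "3B": 0.800, "SS": 0.753,
    "LF": 0.831, "CF": 0.842, "RF": 0.843,
}


def rzr_plus_minus_plays(plays, biz, ooz, lg_plays, lg_biz, lg_ooz):
    """+/- plays = in-zone plays above expected + out-of-zone plays above expected."""
    in_zone = plays - biz * (lg_plays / lg_biz)
    out_of_zone = ooz - biz * (lg_ooz / lg_biz)
    return in_zone + out_of_zone


def rzr_plus_minus_runs(position, **totals):
    """Convert +/- plays to +/- runs using the positional runs/play value."""
    return rzr_plus_minus_plays(**totals) * RUNS_PER_PLAY[position]


# Hypothetical shortstop and league totals (made-up numbers).
example = dict(plays=350, biz=420, ooz=60,
               lg_plays=8200, lg_biz=10000, lg_ooz=1200)

pm_plays = rzr_plus_minus_plays(**example)
pm_runs = rzr_plus_minus_runs("SS", **example)
print(f"{pm_plays:+.1f} plays, {pm_runs:+.1f} runs")
```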

There is some disagreement about whether one should use a rate based on BIZ or innings for the out of zone plays. I prefer to use BIZ over innings because it seems as though that would more accurately reflect the ground ball/fly ball tendencies and handedness of a pitching staff. It is possible, at least on a team level, to estimate the actual number of out of zone opportunities. However, at least in this thread, the suggestion is (based on yet-to-be-published studies) that doing this doesn't make that much of a difference compared to just using BIZ. And just using BIZ is a heck of a lot easier.

BIS's Plus/Minus System

The RZR numbers are based on the same raw dataset that Dewan's company has used to create their own advanced fielding metric, which they refer to as the "plus/minus" system. In the outfield, it essentially operates in the same way that UZR does, where a player's performance is compared within each zone on the field to the average performance of others at his position, rather than just in zones that are assigned to a particular position. On the infield, rather than using zones, it instead uses vector slices to divide up the field...though in practice, it's pretty much the same thing.

I like the plus/minus system. Unfortunately, it's not freely available, so we'll apparently have to wait a while before another Fielding Bible is released to get that data (I tried to purchase it at one point, but they wanted $100 for one year's worth of +/- data--and that was for my private use only!!).

Vector/Nonlinear Function-Based Statistics

Probabilistic Model of Range (PMR)


David Pinto's PMR operates on a slightly different paradigm than zone-based stats. Rather than using zones, he uses vectors, somewhat like what is used by BIS's plus/minus system. Using the batted ball vectors, he can (more or less) plot a function describing a player's fielding prowess relative to the rest of the league, which can be depicted in graphical form (see right). His system uses separate rates for each position depending on ball type (fly, ground, line drive), how hard each ball was hit, and the handedness of the batter and pitcher. The difference between actual performance and league-average performance for each of these ball types at each vector provides the resulting defensive rating. He also includes park factor adjustments.

Therefore, in many ways, this system is very much analogous to UZR...but for whatever reason, it doesn't quite get the hype. The primary criticism I've seen of PMR is that it's a bit too inclusive--it includes all batted ball types for all positions. This means that performance (i.e. ball-hogging) on infield pop-ups is among the ways that infielders can improve their PMR rating. Other stats, especially zone-rating statistics, only evaluate infielders on ground balls hit at or through their positions.

Nevertheless, given the park factor and handedness adjustments that are factored into PMR, I think it's among the better systems available. Pinto has released the data annually for the last three years to the public. He generally just reports the data in terms of the difference between expected and actual outs (i.e. +/- plays), but we can convert those data using Dial's runs/play data into a +/- runs format for our use.

Spatial Aggregate Fielding Evaluation (SAFE)

SAFE is the result of Shane Jensen et al.'s work at the University of Pennsylvania, and has the potential to become the best fielding system available. For ground balls to infielders, it uses a vector-based approach like Pinto's PMR, but then fits a non-linear function to those data for each player and compares that function to league average to determine infielder performance on ground balls. The area between the curves is the difference between the player's fielding performance and league average.

The way they handle fly balls is even more exciting: they plot the landing location of each batted ball and then fit a three-dimensional function describing the frequency with which a player fields fly balls in all locations on the field (see awesomeness to right). This function can be subtracted from a league average function at that player's position to determine a fielder's performance vs. average.

It's a very promising system, combining improved precision in ball location information with improved sample sizes, since they don't have to break up the field into arbitrary zones. Unfortunately, perhaps because it is essentially a side-project for some academics (and thus not well-funded!), they have only released data through the 2005 season. Therefore, it's not something that we can use for current seasons. I'm hoping that some company out there licenses it from them and posts it on their site. Universities always enjoy it when their faculty get the entrepreneurial spirit, and we'd have some fantastic defensive data to work with. :)

The Fans' Scouting Report

Tom Tango has, for several years now, overseen a project that tries to quantify subjective impressions of player performance. The idea is that a composite depiction of fielding ability by informed fans is a useful way of getting information about fielding performance. I really like this project: it provides a dataset that is completely independent of the more objective systems, and, if the data are good (and they typically seem to be), they can really flesh out our understanding of player skill in a way we can't get from our other fielding statistics.

Each fan scout is asked to rate position players on a scale of 1-5 in seven separate skills, ranging from reaction speed and "hands" to arm strength and accuracy. Evaluators are asked to ignore all fielding statistics, and to compare a player to all other players--not just those at his position. The resulting data are reasonably well-correlated with the "pure numbers" metrics, and yet provide some insights that we otherwise can't get about each individual player's skills.

Furthermore, Tango has converted the survey ratings into an approximate +/- runs statistic. He first produced a weighted average of the individual skills for each player at each position based on assumed importance of each of the skills to that position. Then, by comparing the average standard deviation in these averages (~14 points) to the typical standard deviation in runs saved according to UZR (~10 runs), he estimated that one ranking point in the weighted average is worth ~0.7 runs.

Unfortunately, Tango hasn't released his custom positional weights AFAIK. I could sit down and try to come up with my own, but I didn't feel comfortable doing that--I only played ball for one year, so I don't consider myself knowledgeable enough to make those sorts of judgment calls. So I instead cheated and pulled out my multiple regression tools to generate equations that predict Tango's FanRuns values based on his '03 data. Here are the resulting coefficients and intercepts for each position:
Pos Instincts Acceleration Speed Hands Release Strength Accuracy Intercept
3 0.095 0.122 0.018 0.110 0.045 0.038 0.057 -20.8
4 0.213 0.195 0.067 0.118 0.116 0.060 0.044 -41.0
5 0.089 0.080 0.047 0.084 0.083 0.168 0.089 -31.5
6 0.188 0.176 0.091 0.075 0.179 0.048 0.060 -47.3
7 0.077 0.127 0.145 0.064 0.043 0.020 0.013 -27.6
8 0.077 0.195 0.159 0.099 0.037 0.044 0.021 -35.7
9 0.069 0.131 0.125 0.060 0.029 0.030 0.037 -23.9
To get a +/- runs estimate on a player from these equations, simply multiply each of the coefficients by the relevant player rating in the fan scouting report, sum the values, and then add in the intercept. You will have to adjust the intercept to match the data such that the totals within each position sum to zero, as averages in player skills and the identities of the fans doing the scouting vary from year to year.
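Here is a small Python sketch of that procedure using the coefficients from the table above. It assumes the skill ratings are fed in on the 0-100 scale used in the published report (that scale is my assumption here; raw 1-5 survey responses would need rescaling first). The function names and the two sample shortstops are made up for illustration, and the re-centering step simply subtracts each position's mean, which is equivalent to adjusting the intercept so each position sums to zero.

```python
SKILLS = ("instincts", "acceleration", "speed", "hands",
          "release", "strength", "accuracy")

# position number -> (coefficients in SKILLS order, intercept), from the table above
REGRESSION = {
    3: ((0.095, 0.122, 0.018, 0.110, 0.045, 0.038, 0.057), -20.8),
    4: ((0.213, 0.195, 0.067, 0.118, 0.116, 0.060, 0.044), -41.0),
    5: ((0.089, 0.080, 0.047, 0.084, 0.083, 0.168, 0.089), -31.5),
    6: ((0.188, 0.176, 0.091, 0.075, 0.179, 0.048, 0.060), -47.3),
    7: ((0.077, 0.127, 0.145, 0.064, 0.043, 0.020, 0.013), -27.6),
    8: ((0.077, 0.195, 0.159, 0.099, 0.037, 0.044, 0.021), -35.7),
    9: ((0.069, 0.131, 0.125, 0.060, 0.029, 0.030, 0.037), -23.9),
}


def raw_fan_runs(pos, ratings):
    """Sum of coefficient x rating over the seven skills, plus the intercept."""
    coefs, intercept = REGRESSION[pos]
    return sum(c * ratings[s] for c, s in zip(coefs, SKILLS)) + intercept


def centered_fan_runs(players):
    """players: list of (name, pos, ratings). Re-center so each position sums to zero."""
    raw = {name: (pos, raw_fan_runs(pos, ratings)) for name, pos, ratings in players}
    result = {}
    for position in {pos for pos, _ in raw.values()}:
        group = {n: r for n, (p, r) in raw.items() if p == position}
        mean = sum(group.values()) / len(group)
        result.update({n: r - mean for n, r in group.items()})
    return result


# Two hypothetical shortstops with flat ratings of 70 and 50 in every skill.
sample = [
    ("Player A", 6, {s: 70 for s in SKILLS}),
    ("Player B", 6, {s: 50 for s in SKILLS}),
]
fan_runs = centered_fan_runs(sample)
print({name: round(runs, 2) for name, runs in fan_runs.items()})
```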

Note:
This approach "predicts" Tango's FanRuns in the '03 scouting report almost perfectly, and seems to produce reasonable estimates in subsequent years. Nevertheless, there are substantial correlations among most of the player skills, which can destabilize the regression coefficients. Therefore, it's entirely possible that using these equations on different datasets (e.g. different seasons) will produce estimates that differ substantially from those that Tango might produce using his own procedure. Nevertheless, looking at which coefficients are large or small at each position, these equations make sense to me: hands are important at first base, instincts and acceleration are important at second base and shortstop, arm strength is important at third base, speed is important in the outfield, etc. So I'm comfortable using them.

Update: In the comments of this article, Tom Tango directed me to a post in which he did report his custom weights by position. They are, on the whole, pretty similar in relative skill emphasis to the coefficients above. Nevertheless, I'd recommend using them to create weighted averages for each player, then subtracting the mean for that player's position to create a +/- points statistic, and then multiplying by 0.7 runs/point to get your runs estimate. This actually does not predict 2003 data perfectly at all positions (I'm guessing that Tango's using slightly different weights now), but it does create fielding ratings that make sense and, as we'll see in the next article, estimates that correlate well with other fielding statistics.
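A quick Python sketch of that weighted-average approach follows. The skill weights below are placeholders I invented purely to make the example run; they are not Tango's actual positional weights, which you'd need to pull from his post. The 0.7 runs/point figure is the ~10 runs / ~14 points ratio discussed above.

```python
RUNS_PER_POINT = 0.7  # ~10 runs of UZR spread / ~14 points of rating spread

# HYPOTHETICAL shortstop weights, for illustration only (not Tango's values).
EXAMPLE_SS_WEIGHTS = {
    "instincts": 3, "acceleration": 3, "speed": 1, "hands": 2,
    "release": 2, "strength": 1, "accuracy": 1,
}


def weighted_rating(ratings, weights):
    """Weighted average of a player's skill ratings."""
    total_weight = sum(weights.values())
    return sum(ratings[skill] * w for skill, w in weights.items()) / total_weight


def fan_runs_from_weights(players, weights, runs_per_point=RUNS_PER_POINT):
    """players: name -> ratings, all at one position. Returns name -> +/- runs."""
    scores = {name: weighted_rating(r, weights) for name, r in players.items()}
    position_mean = sum(scores.values()) / len(scores)
    return {name: (score - position_mean) * runs_per_point
            for name, score in scores.items()}


# Two made-up shortstops with flat ratings of 60 and 40 in every skill.
shortstops = {
    "Player A": {skill: 60 for skill in EXAMPLE_SS_WEIGHTS},
    "Player B": {skill: 40 for skill in EXAMPLE_SS_WEIGHTS},
}
ss_fan_runs = fan_runs_from_weights(shortstops, EXAMPLE_SS_WEIGHTS)
print({name: round(runs, 1) for name, runs in ss_fan_runs.items()})
```

Since the position mean is subtracted before scaling, the estimates at each position sum to zero by construction, which matches how the other +/- runs statistics in this article are framed.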

Next time, we'll compare each of these fielding systems quantitatively and see how similar their estimates of fielding performance are, and we'll work out how to apply these numbers to get fielding estimates for the 2007 Reds.