Tuesday, September 18, 2007

August 2007 Reds Review Part 4: Fielding

Again, apologies that this is so late...
The Reds' defense actually appeared to take a step forward in August, posting their best DER of the year despite playing behind a pitching staff that was pummeled into little bits of Reds meat.

Below I've calculated the August splits for Reds defenders based on THT's revised zone rating data, using the process I described here. I've also taken the additional step of converting these +- play values into +- run values using the runs per play values given in this article by Chris Dial. The extra step adds an extra bit of uncertainty to the dataset, but at the same time it makes the numbers easier for us to interpret, so I think the trade-off is worthwhile. You can, of course, ignore those values if you wish.

Last         First     POS   Inn   +-PiZ  +-PooZ  +-Total  +-Runs
Cantu        Jorge L   1B     29    -0.7     1.8      1.1     0.9
Conine       Jeff      1B     50    -0.2    -1.7     -1.9    -1.5
Hatteberg    Scott     1B    171    -0.1     1.0      0.9     0.7

Phillips     Brandon   2B    243     0.2     1.4      1.6     1.2
Bellhorn     Mark      2B      1     0.2    -0.1      0.0     0.0
Keppinger    Jeff S    2B      9    -0.8     1.9      1.0     0.8

Bellhorn     Mark      3B     13    -0.4     0.2     -0.2    -0.2
Keppinger    Jeff S    3B     12     1.9    -0.9      1.0     0.8
Encarnacion  Edwin     3B    228    -1.1    -1.4     -2.5    -2.0

Gonzalez     Alex      SS    108     4.3    -1.3      3.0     2.3
Keppinger    Jeff S    SS    146     3.7    -2.3      1.4     1.1

Hopper       Norris S  LF      3     0.0     0.0      0.0     0.0
Ellison      Jason     LF     25     1.0    -0.2      0.8     0.7
Keppinger    Jeff S    LF     11    -1.1     0.0     -1.2    -1.0
Dunn         Adam      LF    214    -6.1    -1.5     -7.6    -6.3

Freel        Ryan      CF      7    -0.7     0.5     -0.1    -0.1
Hopper       Norris S  CF    127     0.0     0.2      0.2     0.2
Hamilton     Josh H    CF    118    -0.8    -3.1     -3.9    -3.3

Hopper       Norris S  RF     11     0.4     0.5      0.9     0.7
Ellison      Jason     RF     11     0.5     0.3      0.8     0.7
Keppinger    Jeff S    RF      7     0.1    -0.2      0.0     0.0
Griffey Jr.  Ken       RF    223     1.8    -5.4     -3.6    -3.0

The Good

Alex Gonzalez had the most impressive showing among all Reds defenders in August, saving roughly two runs above average while playing less than half of the time. That's the sort of player we were told we were getting when we signed him in the offseason, so that's nice to see. As in prior months/years, he seems to do it primarily by being extremely reliable on balls hit in his direction, as his OOZ ratings aren't particularly impressive.

Even more exciting to me, however, was the fact that Jeff Keppinger posted roughly average (if not above average) defensive numbers at shortstop for the second straight month. Obviously, he's not going to have great range at shortstop, as evidenced by his below average ratings out of the zone. But if he can continue to be reliable in zone, he could turn out to be quite a find. Keppinger might be capable of posting an 0.800 OPS at shortstop over a full season, and if he can play average defense at the same time, he would probably be more valuable than Gonzalez to the Reds in '08. ... And that might make a trade of Gonzalez a viable thing to do, which might free up some money for additional pitching...? Just thinking out loud here.

Phillips also turned in another fine performance, pushing his season totals to ~13 runs saved above average. Combine that with 39 VORP on offense through the 17th of September, and you've got yourself a heck of a season. Almost certainly the Reds' MVP this season, at least among position players.

The Bad

Adam Dunn's fielding numbers looked horrible in August. This is a real disappointment, because I keep hearing on the radio how much his defense has improved since Pete Mackanin took over the ballclub. According to Revised Zone Rating, he had easily the worst month of any Reds defender. In fact, he passed both Encarnacion and Griffey to take the lead as the worst fielder on the ballclub, costing the team an estimated 19 runs in his time in left field compared to an average defender.

Dunn currently (again, through 17 September) leads the team with a VORP of 46 runs, but his defense pulls his overall value down to around 27 runs. That's a ~40% drop in value! Yikes. It certainly doesn't negate his value completely, but it's not exactly inspiring either. ... I hate to feed the various Dunn haters out there who always seem to start citing this sort of thing when I post it, but Dunn's defense is enough of a problem these days that we really do need to factor it more explicitly into how we rate him as a ballplayer.

By contrast, Norris Hopper has thus far saved the Reds an estimated 13 runs above average across all three outfield positions (the majority coming early in the season in left field). If we add that defensive performance to his VORP (11 runs), he checks in as being worth roughly 24 runs above the average replacement player. ... in roughly half the PA's as Dunn. I don't think he can keep up that kind of pace on offense or defense over a full season, but the data do give us some perspective on how much the guy has contributed to this team when he has played.
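For anyone who wants to check the arithmetic behind the Dunn and Hopper comparison, here is a minimal sketch. It simply adds the fielding runs to VORP, as the text does; the function name is made up for illustration, and the inputs are the rounded season figures quoted above.

```python
def total_value(vorp_runs, fielding_runs):
    """Offense (VORP) plus fielding runs relative to average, as combined in the post."""
    return vorp_runs + fielding_runs

# Season figures quoted in the post (through 17 September)
dunn_vorp, dunn_fielding = 46, -19
hopper_vorp, hopper_fielding = 11, 13

dunn_total = total_value(dunn_vorp, dunn_fielding)
hopper_total = total_value(hopper_vorp, hopper_fielding)

# Share of Dunn's offensive value given back on defense
dunn_drop = (dunn_vorp - dunn_total) / dunn_vorp

print(f"Dunn: {dunn_total} runs, Hopper: {hopper_total} runs")
print(f"Dunn's drop in value: {dunn_drop:.0%}")
```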

Other poor performances in August came from Hamilton, Griffey, and Encarnacion. Hamilton was playing with leg troubles for some of the month, but he is still probably playing out of position when in CF (he's a natural right fielder). Griffey just keeps on doing what he's been doing all season -- he's fine on balls hit reasonably close to him (+1.8 plays on balls in zone), but makes very few plays out of zone, which really costs his defensive value.

Finally, Encarnacion may have shown some improvement, costing the Reds "just" two runs compared to roughly 5 runs in July. I continue to wonder if the positive UZR split we saw earlier this year was an aberration, or if he really is capable of playing at least average defense at the hot corner in future seasons. If his defense doesn't improve, he's going to have to be a monster of an offensive player to have value...and honestly, while I think he can be a fine hitter, I'm not confident that he'll ever be a great hitter.

Photos by AP/Al Behrman


  1. Thanks for the August monthly evaluations. Solid work as always. First off, I think it is a little unfair to refer to people that point out flaws in a player's game as 'haters'. I would choose the term 'realists'. Just because you point out a flaw does not mean you 'hate' the player in question. It just means that you realize that hitting a ball a long way is only half the equation. This isn't the American League. A player still needs to bring his glove to the ballpark in the NL. Which is why the National League is the better league.
    One question: How much faith do you have in your evaluation methodologies? I ask this for one reason. If you have a lot of faith in them and, without naming names, it was noted that Player A was 27 runs above average, and Player B was 24 runs above average in only half the playing time, what would be a logical conclusion to be reached here? And while it is true that Player A has a much longer track record, Player B is reaching the point where one has to at least consider the possibility that what he is doing isn't a fluke.

  2. Actually, I wasn't referring to folks like yourself who legitimately critique Dunn's defense. I was talking about the people who do nothing but beat the drum about how terrible a player he is. They absolutely exist, and tend to link to my site whenever I post something negative about the guy. :) I will say that Dunn's power is only part of his offensive value... His OBP is probably even more significant, or would be if the Reds leveraged him properly. I get really tired of hearing about how the only thing he does is hit home runs...

    As for the uncertainty of these estimates, it's hard to say for sure. I think Hopper's LF defensive numbers are a bit inflated due to small sample size, because he was among the league leaders in the first half despite playing a really paltry number of innings...and while he looks good out there, he's not vastly better than all the other left fielders in baseball, which he'd have to be for those numbers to be real. Unfortunately, I don't have a good way to quantify either measurement or sampling error with those data.

    We can get a better idea about offensive number uncertainty because there's a better literature on it. According to Table 12 in The Book by Tango et al., the 95% confidence interval around a player's OBP (Hopper's most relevant stat, as he has no power) with 200 plate appearances is +-0.033. With 500 plate appearances, it's +-0.021. Hopper currently has 291 plate appearances, so if we generously split the difference in the uncertainty estimates, his "true OBP" (meaning that due to skill and not luck) could reasonably be expected to vary anywhere from 0.341 to 0.395 (currently 0.368). Given that his BABIP is unusually high this season (0.372), I think we have to expect it's closer to the lower end of that range.
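    The interval above can be reproduced with a short sketch. The two half-widths are the ones quoted from The Book; "splitting the difference" is taken here to mean the simple midpoint. The square-root scaling is not something the comment uses. It is added as an assumption (sampling error shrinking like 1/sqrt(PA)) to check that the midpoint is in the right neighborhood for 291 PA. All function names are illustrative.

```python
import math

# 95% confidence half-widths for OBP quoted from The Book (Table 12)
HALF_WIDTH_200_PA = 0.033
HALF_WIDTH_500_PA = 0.021

def split_the_difference():
    """Midpoint of the 200 PA and 500 PA half-widths, as done in the comment."""
    return (HALF_WIDTH_200_PA + HALF_WIDTH_500_PA) / 2

def sqrt_scaled(pa, ref_pa=200, ref_half_width=HALF_WIDTH_200_PA):
    """Assumption (not from the comment): half-width scales like 1/sqrt(PA)."""
    return ref_half_width * math.sqrt(ref_pa / pa)

def interval(obp, half_width):
    """Symmetric interval around the observed OBP."""
    return obp - half_width, obp + half_width

observed_obp = 0.368   # Hopper's OBP at the time
plate_appearances = 291

midpoint_hw = split_the_difference()
scaled_hw = sqrt_scaled(plate_appearances)
low, high = interval(observed_obp, midpoint_hw)

print(f"midpoint half-width:    {midpoint_hw:.3f}")
print(f"sqrt-scaled half-width: {scaled_hw:.4f}")
print(f"interval: {low:.3f} to {high:.3f}")
```

    Under the square-root assumption, the 200 PA figure scaled out to 500 PA lands close to the quoted 0.021, which suggests the two table entries are at least consistent with that kind of scaling.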

    This tells me that he's likely to be a marginal offensive player in the future, at best. As I mentioned in the August hitting review, his bunting skills might let him achieve a slightly higher BABIP than usual, but I just don't see him posting much better than a 0.350 OBP in the coming season (again, with zero power).

    That can certainly be valuable, provided he's in center field where replacement level offense is weaker. Hopper's defense in center field probably isn't substantially above average based on his numbers this season (+1 run in CF). Which means we're looking at someone who is still a below average player as a starter. As a 4th outfielder, though, he can be quite valuable.

    In contrast, Dunn's combined defense and offense probably put him at about average value (~30 runs over replacement) over a full season. So I'd still start Dunn. -j

  3. I know you've probably posted this before, but how do you deal with OOZ in calculating runs? I'm assuming that Dial's method involved OOZ being built into the numbers, so I wasn't sure how you were working with OOZ that had been broken out.

    And I'd say that Hopper's OF defense is still up for debate. If he were to maintain his current LF level, his RZR of .962 would be higher than any outfielder over the last 4 years by almost 1.5%. Plus, he was making a ridiculous number of OOZ plays (12 compared to 25 BIZ). That pace is twice as high as any LF over the last 4 years, and he'd end up with about 30 more OOZ than Ichiro has put in CF this year. I just don't think it's a realistic pace.

    And I agree with you, Justin, on Hopper's offensive future. He's a player whose value is so strongly tied to his batting average that I would expect to see lots of fluctuation in his numbers on a year-by-year basis.

  4. Also, do you know how Dial came to the average run values per position? It seems like the OF plays are too high to me. Linear Weights puts a double at around .78 runs, so don't his run values mean that any play not made in the outfield is worth more than a double? Am I misinterpreting that?

  5. Here is my question: should the Reds reconsider the idea of putting Dunn at 1B? Or would he be just as much of a defensive liability there as in the OF?

  6. @Joel: Dude, I included a link to my methodology post. Would it kill you to click a bit? :D j/k

    Basically, I calculate +-Plays by getting an average rate of plays in zone per ball in zone in all of MLB. I then calculate how many plays a player should have made based on the number of balls hit into his zone(s).

    +-OOZ is done in the same way. Get the MLB average rate of OOZ/BIZ, then estimate expected OOZ for the player based on his balls in zone. Now obviously that raises some issues, because balls in zone isn't the best denominator for balls out of zone. But a) I judged (without testing it, so maybe it's not appropriate) that BIZ was better context than innings, and b) it is consistent with how OOZ plays have been handled in the past. It's definitely imperfect, but I think it's critical to incorporate OOZ plays into one's estimates. If you don't do that, Griffey actually comes up as an average defender!
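    The two steps described above can be sketched roughly as follows. This is an illustration of the description, not the actual spreadsheet behind the table: the function name and the example fielder are hypothetical, and the league rates (about 80% of balls in zone converted, and about 18 out-of-zone plays per 100 balls in zone) are the rough overall figures mentioned later in this thread, standing in for real position-specific rates.

```python
def plus_minus_plays(biz, plays_in_zone, plays_out_of_zone,
                     lg_in_zone_rate, lg_ooz_rate):
    """Plays above/below what an average fielder would make on the same BIZ.

    lg_in_zone_rate: league plays in zone per ball in zone
    lg_ooz_rate:     league out-of-zone plays per ball in zone
                     (BIZ is the denominator for both, as described above)
    """
    expected_in_zone = lg_in_zone_rate * biz
    expected_out_of_zone = lg_ooz_rate * biz
    pm_in_zone = plays_in_zone - expected_in_zone
    pm_out_of_zone = plays_out_of_zone - expected_out_of_zone
    return pm_in_zone, pm_out_of_zone, pm_in_zone + pm_out_of_zone

# Hypothetical fielder: 50 balls in zone, 42 converted, plus 6 plays out of zone.
# The league rates below are placeholders, not real positional averages.
pm_piz, pm_pooz, pm_total = plus_minus_plays(
    biz=50, plays_in_zone=42, plays_out_of_zone=6,
    lg_in_zone_rate=0.80, lg_ooz_rate=0.18,
)
print(f"+-PiZ {pm_piz:+.1f}, +-PooZ {pm_pooz:+.1f}, +-Total {pm_total:+.1f}")
```

    In this made-up case the fielder is two plays ahead in zone but three behind out of zone, which is the Griffey-style pattern that disappears if the out-of-zone term is dropped.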

    As far as Dial's run values go, a) they are indeed meant to be used with STATS Inc's ZR, which does incorporate OOZ plays (though they badly underestimate the effect of OOZ plays by adding OOZ to the denominator as well as the numerator), and b) no, I haven't been able to figure out where he got them. I was just looking for a quick 'n dirty runs conversion, and since Dial's ZR translations have worked well before, I figured they were probably ok to use to get a rough initial translation. Dial used to work at STATS if I remember right, so maybe he had access to some data that helped him come to those values?

    However, I think your point about the linear weights values of doubles is a very good one and it makes me worry. It indicates that perhaps all of those run values (not just for outfielders) might be overestimates. It's going to be hard to figure out more appropriate conversions without better data though.

    I guess one approach we could take is to mirror what John Dewan did with his +- system: assume that all missed plays to shortstops and second basemen are singles, and that the plays to outfielders and corner infielders are a mix of singles and doubles (perhaps 50/50? ...that's probably too many doubles, but might make up for the fact that we'd be ignoring triples). We could then use the linear weight values to estimate runs lost. Doing this would result in much lower estimates of defensive runs (I'm guessing ~30%ish lower)...but maybe it's a good idea to be conservative given how poorly all the various fielding metrics agree?

    Also, I'm in agreement with you about Hopper (see my comment #2). I think he's a plus defender, but he's not Willie Mays.

  7. @Alex, players generally do have better fielding numbers at 1B than at LF based on a small study by Tom Tango earlier this season. So I think that moving Dunn to 1B is definitely something that the Reds need to consider.

    The problem, of course, is that they have Votto to deal with, and he should be playing. But I dunno--maybe Votto would be better in left field than Dunn is?

    It's worth noting that Dunn was actually passable in left field until '06-'07. He wasn't quite average, but was only a few runs below average. So it's surprising to see him posting such bad numbers over the last two seasons. Maybe he's lost a few steps?

  8. @Joel - Just thinking out loud here...but another approach to get runs estimates might be to compare mgl's UZR data to the Hardball Times' RZR data at the end of the season. We might be able to base the Runs estimate from a regression coefficient.

    The problem there, unfortunately, is that RZR and UZR tend to be badly correlated. And poor correlations would probably reduce the regression coefficient more than is appropriate. In fact, the agreement is frighteningly bad at times, especially in the outfield, where correlations can be on the order of 0.1-0.2. Part of that is that UZR includes park factors, adjustments for handedness of batters and pitchers, etc. But, as a series of articles earlier this month(ish) at THT showed, another part of it is that STATS Inc and BIS's datasets don't agree particularly well. -j

  9. Dude, I included a link to my methodology post. Would it kill you to click a bit?

    Oh, what, so now I'm supposed to actually click the links that you post? What's next? Am I going to have to start reading your tables of data too?

    I agree that OOZ plays are important, I just didn't know how you were handling them. I agree that BIZ probably isn't the best denominator for OOZ, but I can't think of what would be better. Is it just me, or would you agree that to get a proper handle on OOZ, you really need a large sample? BIZ seems like it can get by with a smaller sample since the zones are defined for the position and we know how many balls were in the zone. Sure, things can change much more over a larger sample, but if a guy has 100 BIZ, we can reasonably estimate if he is a stable defender or not (i.e. he makes the plays he's supposed to make). However, for someone like Hopper, I don't think we know his value in LF because so much of it is tied to a large number of OOZ that could just be flukish. Am I making sense?

    Dial used to work at STATS if I remember right, so maybe he had access to some data that helped him come to those values?

    I'm guessing Dial did some digging into the play-by-play data and calculated the run value of plays missed in the zones over a period of time. Still though, it seems like those values are high as I have a hard time believing that a missed ball in the outfield is always worth between a double and a triple. Though, I suppose the value could be different if you look at it from the perspective of what it means to make a play. Since most balls in zones are more or less expected outs, it could be the difference between the run value of the out compared to the typical hit. I could see that being higher than the value of a double. But if that's the case, then a different run value should be used for OOZ than BIZ. Wouldn't you agree?

    Just thinking out loud here...but another approach to get runs estimates might be to compare mgl's UZR data to the Hardball Times' RZR data at the end of the season. We might be able to base the Runs estimate from a regression coefficient.

    Okay, you lost me. Too much math for my brain.

  10. I was wondering if you know how they determine how to divide the field into zones? Division of the infield should be pretty much standard. But outfield size and dimensions vary greatly. Do they divide the outfield into three equal parts? Is each outfielder expected to cover an equal amount of ground? Houston's stadium has an extremely short left field and an extremely large center field. Also, is foul ground considered at all? Or are any foul balls that a fielder may reach a bonus? I would appreciate any light you could shed on this for me. Thanks in advance.

  11. From the Hardball Times:
    The areas on a ballfield in which at least 50% of batted balls are handled for outs. Zones are standardized and defined separately for each position.

    This implies that the whole field is not covered by zones, rather there are gaps, especially in the outfield. I don't know if this is true, but it seems true based on that definition.

    As for the actual zones, I think those are proprietary to BIS. I don't know anywhere that has posted them. Here is the zone division from STATS Inc. I don't know which specific zones apply to each position, but at least you can see how they divide up the field.

  12. @Joel - I'm going to send David Gassko an e-mail. He recently posted some THT data converted to +- runs, and I'd be interested to know what he did in this regard.

    I think your point about the difference between a hit and an out is a really good one, and might explain these values. The point about the OOZ is really interesting too. Seems like OOZ plays are more likely, at least for outfielders, to end up as doubles. In-zone plays might end up as doubles, but on a shallow fly, for example, they might just end up singles. So the runs value of an OOZ play might be higher than an in-zone play.

    @Dave - Joel has it right, though I think BIS uses slightly larger zones than STATS Inc does to increase sample sizes within each zone.

    Basically, the field is first divided into zones. And then for any one position, they determine those zones in which fielders made 50% or more of the plays. Those are the ones considered "in zone" for that position. It does mean that there are large portions of the field that are not attributed as the responsibility of any defender.

    I don't really like that approach, to be honest. I'd rather get an average probability that a play will be made within each zone by a particular position, and then compare that to a player's actual rates within each zone, summing up the differences to get an overall rating. That's exactly what Dewan did in the Fielding Bible with his new +/- stat, but unfortunately that's not available to the general public. They told me they'd sell it to me for my personal use only for $100, but that's a hard sell to the wife. :)
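    To make the contrast concrete, here is a minimal sketch of the per-zone idea described above, next to the 50% cutoff that decides what counts as "in zone". It is not Dewan's actual system, whose details are not public. The zone data are invented for illustration, and the function names are hypothetical.

```python
def is_in_zone(league_out_prob, threshold=0.5):
    """RZR-style classification: a zone counts for a position only if
    fielders there convert at least half of the balls hit into it."""
    return league_out_prob >= threshold

def zone_plus_minus(zones):
    """Sum of (plays made - expected plays) over every zone, with no cutoff.

    zones: iterable of (chances, plays_made, league_out_prob) tuples
    """
    total = 0.0
    for chances, plays_made, league_out_prob in zones:
        total += plays_made - league_out_prob * chances
    return total

# Invented data for one fielder: (chances, plays made, league out probability)
example_zones = [
    (20, 19, 0.90),  # routine zone: 18 expected, 19 made
    (10, 4, 0.50),   # borderline zone: 5 expected, 4 made
    (10, 2, 0.10),   # tough zone: 1 expected, 2 made
]

rating = zone_plus_minus(example_zones)
in_zone_flags = [is_in_zone(prob) for _, _, prob in example_zones]
print(f"plays above average: {rating:+.1f}")
print(f"zones counted as in zone under the 50% rule: {in_zone_flags}")
```

    The point of the per-zone version is that a play in a 10% zone and a play in a 40% zone are each judged against their own expectation, rather than being lumped together as "out of zone".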

  13. Maybe the NL should adopt the DH...

    Just kidding.

    Maybe Dunn's numbers are worse the last two years because he has been in LF instead of RF. I know that casual fans usually assume there is no real difference between playing those two positions, but I doubt that is actually true for all players. Perhaps Dunn is a better RF than LF for some reason.

    I have always been a Dunn supporter and I think many are way too tough on him. But these defensive numbers are, frankly, atrocious.

  14. I was thinking the run value of an OOZ play might be lower because while a BIZ play assumes the out, you can't do so on an OOZ play. That is, a BIZ value would be calculated sort of like (absolute value of out + average value of hits), whereas an OOZ play would just be the average values of hits. Sure, the average value of hits that fall in the OOZ areas would probably be higher, but not likely as high as the average value plus the value of the out. At least, that's just a guess.

    Hopefully Gassko will be able to shed some light on the subject.

  15. @Alex - generally, players do better in LF than in RF (I've seen studies of this, but haven't located them in a quick search...I can probably find 'em if you're interested). I'm not sure that the position itself is more demanding, but teams tend to put better defenders in right field than left, which means there's a higher standard over there. ... I guess there might be an advantage to playing right field as a lefty, though, because you wouldn't have to spin around to throw on balls hit down the line. But, of course, that advantage might be offset by the fact that you're having to backhand line drives hit that way.

    @Joel - I think(?) you're making the assumption that players shouldn't be penalized for missing balls out of zone, and only rewarded for making those plays. I disagree with that. I think we should expect a certain frequency of plays made on balls out of zone. It's just that the expected frequency of those plays is far lower than plays in-zone (based on this season's averages, we should expect ~80% of plays in zone to be made, but only ~18% of plays out of zone to be made, again using BIZ as the denominator for both calculations). The point is to compare any fielder to the average performance of all fielders at his position.

    The way I'm looking at run values is that we want to know the average difference in run value between making a play and not making a play. I was thinking that the consequence between making and not making a play out of zone would be higher than in-zone, because I *think* you'd be more likely to be dealing with balls in the gap among plays out of zone. But then again, little flares in the shallow outfield probably are out of zone too, and they're usually not going to be anything but singles.

    One dataset that might inform this question is the original methodology post I mentioned. There, I investigated how we should weight plays made in and out of zone by comparing these data to David Pinto's PMR. Except for third base, which has all sorts of other peculiarities, I found that a 1:1 weighting of in zone and out of zone balls seemed appropriate. So maybe we should just treat in zone and out of zone plays the same way for our runs calculations...?

  16. Honestly, the more I look at it, the more I think that the right way to do it is how you said Dewan does it in the Fielding Bible. I feel like THT's ZR is more accurate than STATS Inc, but ultimately, OOZ are going to be somewhat misvalued because we don't know if the OOZ play was made in a zone where 10% of plays turn into outs or 40% of the plays turn into outs. It may not be a significant difference, but I would guess that balls in the 10% are almost always extra bases while balls in the 40% area are more frequently singles.

    I guess I'm just frustrated because this data is proprietary. That's why I think you should quit your job and start watching all of the games every day and tracking this information so it is freely available on the internet. If you were truly dedicated to your readers, you'd do it.

  17. I think this is, by far, the best overall methodology for evaluating fielding--and it results in the kewlest looking figures. :) Unfortunately, they aren't releasing '07 data, probably because they haven't paid for in-season data.

    I guess the alternative to quitting my job is doing what MGL did and shell out $10,000 a year for those data! :D I could set up a paypal account for donations... :)

    Money really is the bane of baseball analysis. Between BIS's fielding data and BPro's frustrating tendency to keep all of their methods a secret, we're probably a few years behind where we could otherwise be. But then again, if no one could make money on this stuff, we might not have people dedicating themselves full-time to analysis. ... still, the proprietary stuff really is an impediment to amateurs.

    Still, I'm betting that it will only be a few years before ESPN et al. finally start publishing high quality fielding stats (STATS Inc's ZR doesn't count). It's bound to happen within the next five years or so, right? I hope?

  18. I heard back from Gassko -- he used Dial's numbers.

    The more I look at them, the more comfortable I am with them, at least for want of better referenced data.

    As per the above discussion, I'm not sure that we can reasonably treat out of zone plays differently than in zone plays unless we get data with better resolution like that which Dewan uses in his +/- system. Gassko reports that Dial's data are based on the average value of a missed play (i.e. average run value of a single/double + run value of an out, adjusted for the typical rate of singles/doubles from a particular player), presumably based on data with better resolution than I have. And furthermore, they look like they're in the right ballpark:

    Based on this article by Tango et al., these are the average increases in runs scored compared to an out for the relevant plays:
    Single - 0.77 runs
    Reached Base on Error - 0.78 runs
    Double - 1.08 runs
    Triple - 1.56 runs

    Dial's values range from 0.75 runs/play for middle infielders to 0.8 for corner infielders to 0.83-0.84 for outfielders. So they're heavily skewed towards singles, which is appropriate. But positions where hits are more likely to turn into doubles have higher estimated run totals/play.
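    As a rough sketch, the plays-to-runs conversion with these values looks like the following. The per-position numbers are the ones quoted above; which outfield spot gets 0.83 and which gets 0.84 is not stated in the comment, so the LF and CF assignments here are inferred from how the table in the post rounds, and the RF value is a guess. The function name is illustrative.

```python
# Runs per play, per the ranges quoted above.
# Assumption: LF = 0.83 and CF = 0.84 are inferred from the table's rounding;
# RF = 0.83 is a guess (either value reproduces Griffey's rounded figure).
RUNS_PER_PLAY = {
    "1B": 0.80, "3B": 0.80,              # corner infielders
    "2B": 0.75, "SS": 0.75,              # middle infielders
    "LF": 0.83, "CF": 0.84, "RF": 0.83,  # outfielders
}

def plays_to_runs(plus_minus_plays, position):
    """Convert plays above/below average into runs above/below average."""
    return plus_minus_plays * RUNS_PER_PLAY[position]

# +-Total values taken from the August table in the post
august = [("Phillips", "2B", 1.6), ("Encarnacion", "3B", -2.5),
          ("Dunn", "LF", -7.6), ("Hamilton", "CF", -3.9)]

for name, pos, plays in august:
    print(f"{name:12s} {pos}  {plays:+.1f} plays  {plays_to_runs(plays, pos):+.2f} runs")
```

    Small differences from the table's +-Runs column are expected, since the table was presumably computed from unrounded play totals.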

    If anything, I'd say that Dial's values seem a bit conservative. For example, a missed play at shortstop should at least be worth the difference between a single and an out. But I'd prefer, at least for the time being, that estimates of defensive impact on player performance be conservative. And because they are published and have been used by others, I'm inclined to stick with them until we find something better. After all, in the above article I've already estimated that Dunn's defense has diminished his value by 40% this season. As a Reds fan, I don't like reporting that, and certainly don't want things to look worse for Donkey! :)

  19. Can we credit Dunn with an additional play made after the bad trap call tonight? :) -j