On Baseball & The Reds: mailbag

Thursday, July 30, 2009

Mailbag: Pitching Questions

What is wrong with Harang this year? Image by Getty Images via Daylife

Had a few questions arrive in the inbox. Questions may be paraphrased, and since I didn't ask permission to post them, they are anonymous (feel free to chime in if you wish).

Reader #1:

What is your take on Aaron Harang? Has he collapsed, or is he just unlucky?

Note: most of what follows is from Harang's player pages at FanGraphs.

Comparing the 2008-2009 Harang to the 2006-2007 Harang, I see two substantial differences. The first is a decline in his strikeout rate, from the mid-8s to the mid-7s in K/9. The second is a decline in ground ball rate from 39% or so to a fairly Milton-like 35%. All the while, though, his walk rate has stayed pretty constant--maybe a slight uptick, depending on how you measure it. So he still has a very fine K/BB ratio, even if it's not as good as it used to be.

I recently read something about his slider's vertical movement having vanished. There is a pretty steep decline in slider vertical movement between 2007 and 2008/2009, which corresponds to a big drop in his slider's run value. Looking at this graph, it seems as though it happened about three starts before his DL stint. So there absolutely could be something to that, and it could explain why his strikeout rates have dropped a fair bit of late. But then again, there's an increase in horizontal movement in his change-up over the same time, without much change in its run value.

The various DIPSish metrics are mixed on him. His FIP is up from 3.7 to 4.4 or so, which is a steep climb. But his xFIP, which is a better predictor of future performance, is up only slightly, from 3.8 to 4.1 or so. So that's encouraging.
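For anyone who hasn't seen how these two are built, here's a rough sketch using the commonly published forms of FIP and xFIP. The stat line, the 3.20 constant, and the 10.5% league HR/FB rate are illustrative assumptions of mine, not Harang's actual numbers.

```python
def fip(hr, bb, hbp, k, ip, constant=3.20):
    """Fielding Independent Pitching: only homers, walks, HBP and strikeouts count.

    The constant (assumed 3.20 here) just puts FIP on an ERA-like scale.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, bb, hbp, k, ip, lg_hr_per_fb=0.105, constant=3.20):
    """xFIP: same as FIP, but swaps actual homers for fly balls * league HR/FB rate."""
    expected_hr = fb * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical pitcher whose fly balls are leaving the park more often
# than league average (20 HR on 160 fly balls = 12.5% HR/FB).
example_fip = fip(hr=20, bb=30, hbp=3, k=110, ip=130)
example_xfip = xfip(fb=160, bb=30, hbp=3, k=110, ip=130)
print(round(example_fip, 2), round(example_xfip, 2))
```

When a pitcher's HR/FB rate runs above the league rate, FIP comes out higher than xFIP, which is the same direction as the gap described above.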

His tRA has gone from 3.75ish to 5.1 or so (yikes). But his tRA*, which includes appropriate regression and thus is a better predictor than tRA of future performance (FIP:xFIP::tRA:tRA*), has "only" increased from 4.1 to 4.9 or so. Again, not as bad as the unregressed data, though clearly not the level of performance we saw from him during 2006-2007.

He's essentially performing right at his ZiPS projection this year, which I would have said was pessimistic. Not surprisingly, his ZiPS for the rest of the year is unchanged. That's the only projection system I have access to that gives in-season updates, but I think all the other projection systems (which had Harang in the 4.0-4.2 ERA range) would now be closer to a 4.35 ERA or so.

So I think the message here is that, at this point, Harang is still a quality starter, and can be expected to post something close to a league-average ERA (which is still a bit above-average for a starter). I'd put his expected WAR at around 2.5 to 3 by season's end, and probably about 2.5 next season. That's a quality pitcher. Not the 5 WAR beast he used to be, but still a good pitcher.

He's being paid $11M this year (expected 2.5 WAR) and $12.5M (~2.5-2.7 WAR, depending on the market) next year, which means he's pretty much being paid appropriately. No problem there either.
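The arithmetic behind that, as a quick sketch. The salaries and WAR estimates are the ones above; the function name is just mine for illustration, and what you take as the going market price of a win is left for you to compare against.

```python
def dollars_per_war(salary_millions, war):
    """Implied price (in $M) paid per win above replacement."""
    return salary_millions / war


this_year = dollars_per_war(11.0, 2.5)        # $11M for ~2.5 WAR
next_year_high = dollars_per_war(12.5, 2.5)   # $12.5M for ~2.5 WAR
next_year_low = dollars_per_war(12.5, 2.7)    # $12.5M for ~2.7 WAR

print(f"this year: ${this_year:.2f}M per win")
print(f"next year: ${next_year_low:.2f}M to ${next_year_high:.2f}M per win")
```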

Reader response: Something you haven't mentioned is that Harang's DER is 2nd worst in the majors. I'd like to see some study on the effect that has on a pitcher's overall numbers. My reasoning is that if a pitcher is getting less help from his defense, his other numbers might naturally decline simply because of the extra effort it takes to get through an inning. Put another way, a pitcher who basically has to get 3.5 outs an inning might be more likely to give up home runs or other hits than one who only has to get 3 outs an inning.

I didn't directly address the DER issue, though I did notice it. My feeling is that this year has to be luck-induced, as opposed to fielding-induced. It's the worst DER of his career, but at least as far as Reds teams go, this has been the best defense he's ever had playing behind him. Granted, maybe there were some screw-ups while he was on the mound given the small sample, but it makes no sense that his DER would get worse as his team's fielding undergoes a massive transformation unless it's largely a timing/"luck" issue.

Reading this again, I sort of missed the point as far as the psychology issue is concerned. My feeling is that you're right about it, but I don't know what the effect size is.

The next question comes from a different reader:

I am wondering if there is any way you could address this question/issue on your blog or maybe on BTB if you think it is interesting enough. You probably know about the Reds starters' struggles in the 1st inning of games this season. The stats, approximated as closely as I can tell, are the following:

Bronson Arroyo: 9.46 ERA, 1.080 OPS
Aaron Harang: 7.36 ERA, .870 OPS (OPS worse after today's start against the Padres)
Johnny Cueto: 9.00 ERA (slightly worse), 1.150 OPS
Micah Owings: 7.50 ERA, .860 OPS
Edinson Volquez: 7.00 ERA, .913 OPS

Also, Harang, Cueto, and Arroyo are 3 of the 5 worst pitchers in the league in BAA the first time through the order, and the Reds have by far the worst 1st inning run differential in all of MLB. So my question is, do you have any idea or insight into why this might be? Any explanation for it at all? This phenomenon is very perplexing to me, as is the fact that apparently no adjustments are being made to correct it.

I honestly have no clue. Most teams do worse in the first inning than any other inning, simply because that's the only inning where you are guaranteed to see at least one (and often more) of the other team's best hitters. But why the Reds might be especially bad during the first inning isn't something I have an explanation for, unless they're not warming up enough or something (very unlikely, I'd think). Especially because the Reds rotation, while not awesome, isn't bad either.

It's not like the 8th-inning problem the Reds had a few years back, where the problem was that the Reds didn't have anyone competent enough to get the ball to Weathers in the 9th. There was a clear explanation for what we saw that year.

My honest guess is that it's noise. But I can't think of a clear way to test that further that would provide any sort of clear answer. Maybe some of the pitchf/x'ers can look at pitch motions and velocities in different innings and identify an issue. But that's not in my bag of tricks, at least not yet.

Sorry, that's the best I can do.

Sunday, June 21, 2009

WAR Q&A

Rob Dibble's career looks better with WAR than with win shares. Image via Wikipedia

I've been exchanging emails with a friend about Rally's WAR data (and competing systems) and it turned into a nice Q&A that I thought I'd post here. Quotes are my friend, the rest is me.

1) Is it safe to say that WAR has a much higher bar than WS? (any bad regular can accumulate WS in a season, but with WAR a bad regular will be around +/- 1.0 WAR)

Yes, WS uses too low of a baseline. If a replacement player (e.g. Willie Bloomquist) plays enough, they will have positive win shares but a zero WAR. This is why Bill James has finally started developing "loss shares," which is a (clunky, in my view) way of dealing with this problem. Of course, James isn't publishing loss shares yet, so it's hard to know what to make of them (if anything). Rally's stuff also uses a better fielding metric than James's stuff does, so at this point I think it's safe to say that WS is vastly inferior to what Rally's selling (which I have purchased, fwiw--it's exactly what I was hoping for, although you do have to mesh it with a database...I stuck it in Lahman's database...to get the retroID's to match up with names. No biggie, as I'm trying to learn to use a database as it is).

2) WAR includes everything in some form - defense, baserunning, position adjustments, ballpark/league adjustments (which is not that different from WS except I don't think it had baserunning). Am I missing anything?

I'm not sure about league adjustments, though I wouldn't be surprised. It does include range, arms, dp turning (all w/ TotalZone), baserunning (beyond just sb's, also advancing 1st to third and stuff like that), park adjustments, etc. Rally's pretty awesome. :)

3) For the purposes of explaining WAR, what would you say the scale is? I'm guessing at something like:

2.0 - Useful Player
5.0 - Good Regular/All Star Candidate
7.0 - MVP Candidate
10.0 - MVP in a normal season

2.0 = MLB average player playing roughly a full year. I think a good regular is ~3 WAR, but yeah, 5 WAR = All-Star. 10 WAR = Pujols, 12 WAR = Bonds. :)

Does that sound right? Is it different for pitchers?

Yeah, I think pitchers tend to score a tad lower, at least at the high end. Clemens had a few 10 WAR seasons, but Sabathia last year was ~7 WAR (across leagues) for example, as was Halladay. I still think of an average starter as ~2 WAR, though.

4) Are there differences between WAR on the various sites (or for that matter, BP's revised WARP figures). I know you favor WAR, just want to understand in a nutshell the differences.

Rally's WAR is calculated almost exactly as Tango does it, which is also how FanGraphs does it (though Rally has more baserunning info than FanGraphs, plus reached-on-errors). Hitting=lwts, pitching=BsR + PythagenPat, fielding=best available system. Everyone doing WAR is using the same baseline, which is ~2 wins below average per season (currently they're using 2.5 in the AL because it's so much better, but that only goes a few years back).
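To make the win-conversion step on the pitching side a bit more concrete, here's a minimal sketch of the Pythagenpat idea: the exponent floats with the run environment instead of being fixed at 2. The 0.287 power is a commonly cited value that I'm assuming here, and the run totals are made up; this isn't necessarily how Rally has it set up.

```python
def pythagenpat_wpct(runs_scored, runs_allowed, games, power=0.287):
    """Expected winning percentage with a run-environment-dependent exponent."""
    runs_per_game = (runs_scored + runs_allowed) / games
    x = runs_per_game ** power  # exponent grows as scoring goes up
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)


# Hypothetical: an average offense (4.7 R/G) backing a pitcher
# who allows 3.5 R/G, expressed per single game.
good_pitcher = pythagenpat_wpct(4.7, 3.5, games=1)
even_team = pythagenpat_wpct(4.7, 4.7, games=1)
print(round(good_pitcher, 3), round(even_team, 3))
```

Compare that winning percentage to what a replacement pitcher would get in the same innings, and you have wins above replacement.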

WARP, the new version at least (as of ~February), is much closer to WAR in its baseline (the old version assumed replacement players are atrocious fielders, which has zero empirical support). But it suffers from a few remaining flaws, like the use of offense-based position adjustments instead of fielding-based ones (there are some years when CF's hit better than LF's, and as a result WARP gives LF's a bonus over CF's...which is beyond absurd given the differences in fielding difficulty between those two positions). WARP also, I believe, doesn't use a different baseline for relievers and starters (starting is harder than relieving; the same pitcher will put up better numbers as a reliever than as a starter), and doesn't recognize leverage for closers like Rally's stuff does. WARP isn't terrible anymore, and I like it better than WS now. But WAR is a bit more current in its research underpinnings, mostly because it's based on a collaborative effort of lots of extremely smart people (Tango, MGL, Rally, Patriot, and all the other people over there) instead of just one extremely smart guy (Clay Davenport).

If James's name wasn't attached to win shares -- or to runs created, for that matter -- it would have disappeared by now. But he's deservedly a giant, even if one who is sort of being left behind these days, and so his stuff remains in use even after it's obsolete.

5) Does WAR "favor" peak value vs. career value? To give an extreme example of two Reds players - Ron Oester totaled 9.3 WAR as a Red vs. Rob Dibble's 9.4. In WS, Oester has 112, vs 63 for Dibble. Thoughts?

WAR is pure career value. You can look at peak value by pulling those years out and doing something to them, but career WAR is just career value.

The reason that Oester tops Dibble in WS but not WAR is because of the problem of win shares' baseline. You can be a crappy player for a long time and accumulate a lot of win shares, while you might not get any WAR. WAR requires a higher level of play to get "credit." It's not an exceptionally high bar, but it's higher than win shares' baseline. I'm sort of guessing here, but if a AAAA player is the baseline for WAR (which is about right), then a AA/AAA player might be the baseline for WS. That means that in WS, a AAAA player playing 20 years in the MLB would get a lot of WS but no WAR to speak of.

If and when James ever publishes his loss shares (probably in some new book or something), what we'll find is that Oester accumulated far more loss shares than Dibble. And as a result, his career contributions (win shares - loss shares) are roughly the same as Dibble's. In other words, when we eventually get the data we need from James, we'll finally get to the point that we're already at with Rally's WAR.

FWIW, I do tend to think that peak has to be taken into account, and not just accumulated career value, when you're talking the Hall of Fame. For that reason, I'd definitely rank Dibble over Oester. Dibble was pure badass dominance for a short number of years, which included that 1990 team. Oester was a decent player for a while, and had one genuinely good year in '85 (if TotalZone is to be believed, his fielding was better that year...probably in part random error though), but Oester was never ever ever dominant.

...

And then in another email a bit later...

For the hitters:

I assume Bat is batting runs, BSrun is base running runs (turns out Pete was a pretty good baserunner - I was always curious if he was "too aggressive" but apparently not). DP is a debit/credit for hitting into DP's.

On the defense...

Total Zone - is that runs above average? I hope so, or I'm going to have to rethink Concepcion as a fielder.

IF DP - runs over/below average given opportunities?

OF Arm - same as above for OF

Catcher - is this some sort of fielding total for catchers based on CS, PB, WP?

Could all of these be totaled to create a total defensive runs saved?

YES to all above. That is what is done when calculating WAR.

The Total looks like the sum of all the runs, but then there is a Position adjustment. What does that represent?

It's an era-specific adjustment for the position a player plays. So, SS has the best fielders, and thus the highest level of competition. So, if you're an average fielding shortstop, you're an above-average fielder, so you get a bonus to account for that. Rally also made these adjustments specific over history, as the differences in quality among positions hasn't been constant.

Also, this is the only place that position comes into play. Offensive batting runs numbers are straight-up offense, without consideration for position. This is because the offense-based adjustments you see in WARP or VORP, for example, assume all positions have equal talent levels. And that's not true. Second basemen and third basemen are roughly equal fielders in modern baseball, but third basemen are better hitters. That makes 3B's a more talented position than 2B's. If you just do position adjustments by offense, you miss that and overrate 2B's relative to 3B's.

The "rep" column - is that replacement run level given the playing time for the season?

It's the difference between an average player and a replacement player, pro-rated for playing time. Since everything else is given vs. average,

And what is the RAR column before the WAR? I'm assuming it is a calculation of some form, but I can't figure it out.

So:
"Total" is offense (bat + BSrun)
RAR is everything (offense + all fielding + position adjustment + replacement)
WAR is RAR converted to wins (runs divided by league average runs per mlb game, ~9.4 r/g or so)
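Spelled out as a sketch: the column names follow the ones discussed above, the ~9.4 divisor is the figure I just gave, and the player's numbers (including the 20-run replacement adjustment for a full season) are made up for illustration.

```python
def runs_above_replacement(bat, bsrun, fielding, position_adj, rep):
    """RAR = offense (bat + BSrun) + all fielding + position adjustment + replacement."""
    offense = bat + bsrun
    return offense + fielding + position_adj + rep


def wins_above_replacement(rar, lg_runs_per_game=9.4):
    """Convert runs to wins by dividing by league-average runs per MLB game."""
    return rar / lg_runs_per_game


# Hypothetical full-season regular (all values in runs vs. average,
# except rep, which bridges average to replacement).
rar = runs_above_replacement(bat=15, bsrun=2, fielding=5, position_adj=-2.5, rep=20)
war = wins_above_replacement(rar)

# A dead-average player with the same playing time only gets the rep runs.
average_war = wins_above_replacement(runs_above_replacement(0, 0, 0, 0, 20))
print(rar, round(war, 1), round(average_war, 1))
```

Note that the dead-average player lands right around 2 WAR, which matches the baseline being roughly 2 wins below average per season.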

On the pitching side...

The runs must be how many the pitcher gave up, and the Rep runs is what a replacement pitcher would have given up given the innings. Def is the defense behind the pitcher.

So pitchers get credit for pitching behind a bad defense, correct? (The Big Red Machine pitchers get hurt if that is the case, since they all had a good defense behind them.) How does that adjustment work?

Right. From Rally's site:

Def - Estimated runs saved by this pitcher's defense, using TotalZone range, DPs, OF arms, and catchers, prorated by the number of balls in play allowed by the pitcher

Awesome. This gets away from having to use something like FIP or xFIP or tRA to extract pitcher performances from fielding performances.
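A minimal sketch of the prorating idea in that definition, assuming the team's defensive runs saved are simply split by each pitcher's share of the team's balls in play. The function name and the numbers are hypothetical; I don't know the exact bookkeeping Rally uses.

```python
def pitcher_def_runs(team_def_runs, pitcher_bip, team_bip):
    """Share of the team's defensive runs saved credited to one pitcher's innings,
    prorated by the balls in play he allowed."""
    return team_def_runs * pitcher_bip / team_bip


# Hypothetical: a defense worth +40 runs over the season, and a starter
# who allowed 600 of the team's 4,400 balls in play.
good_defense = pitcher_def_runs(team_def_runs=40, pitcher_bip=600, team_bip=4400)

# Same pitcher behind a defense that cost its team 40 runs.
bad_defense = pitcher_def_runs(team_def_runs=-40, pitcher_bip=600, team_bip=4400)
print(round(good_defense, 1), round(bad_defense, 1))
```

A positive number means the fielders helped him (so he gets a bit less credit for his runs allowed), and a negative number means they hurt him.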

Likewise for the Leverage Index - do pitchers get "extra credit" for a high Leverage Index?

Relievers get partial credit for leverage index. I think they get bonus for any leverage above 1.5 or so. I don't know exactly how Rally does it, but Tango talks about how he does it here: http://www.insidethebook.com/ee/index.php/site/comments/how_to_calculate_war/

I'm doing the same thing now with my Reds stuff, though I do it for all relievers and Tango only does it for closers. I like my way, because special relievers like Marmol get extra credit, which I think is appropriate.

----------------
Thanks for the great conversation!
