
Thursday, January 03, 2008

Dave Stieb and an historical Reds baseball proposal

So every once in a while (and probably not often enough), I realize exactly how clueless I really am.

Today, Tom Tango pointed out that David Stieb may have been the best pitcher of the '80s. And I'd never even heard of him. Not recognizing how good he was is one thing. But I'm absolutely embarrassed that I'd never even heard of the guy. I did know of his principal competitors--Blyleven, Welch, Hough, Valenzuela, Morris... But Stieb? I got nothing.

...which brings me to my next point, and something I was thinking about over the holidays. Given how absolutely clueless I am about anything that happened in baseball before 1990 (and, one can argue, before my post-strike "return" to baseball fandom in ~'98), I'd like to start doing a bit of historical stats work around here from time to time. So at some point this season, I'm planning to take a look back at all 13 of the division/league/World Series champion Reds teams. I'm probably also going to include the '94 and '99 teams, for a total of 15 articles. I'll probably work backwards, but who knows? Might be fun to start at the beginning with the 1882 American Association Champion team too. :)

Anyway, my main goal will be to apply player value estimates to those teams and see who shakes out as the top contributors. I'm sure that there will be some surprises in terms of how players rank on these teams, especially when we work in fielding. I'm going to try to avoid doing position-by-position commentary to keep the articles from getting too big...but everything I do tends to get too long anyway, ya know?

Here are some initial thoughts about methods:
  • For offense, I'm going to use a base runs model that is optimized for ±10 years around each team (or as many years within that window as possible). I'll then calculate custom league linear weights for MLB in the season in question and use those to estimate player runs created.
  • For pitching, I'll probably have to use a different base runs model (again, ±10 years) depending on what stats are available over at Baseball-Reference. This is especially true for the pre-Play Index teams, where I won't have plate appearance data for pitchers.
  • For defense, I'll use the best data I can find. For more recent teams, that'll probably be either ZR or (best case) UZR. For teams prior to 1980, I'll probably have to use BPro's RAA...though I might look into doing something like Sean Smith's TotalZone or Dan Fox's SFR (don't hold your breath though). For really old teams, I may have to just assign approximate run values to errors or something. We'll see.
  • I'll include appropriate park factors, probably using U.S. Patriot's 1901-2004 5-year regressed park factors.
  • I'm going to need to figure out a way to estimate leverage index based on performance or something for my pitching valuations to have any merit. We'll see on that--Tango did offer some suggestions recently. My guess is that this will be the weakest methodological point about my series.
  • I'm thinking that I'll use runs above average throughout the analyses, rather than runs above replacement. The reason is that I'm very skeptical that replacement level in earlier eras was anywhere close to where it is today (~73% of league average for hitters).
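
To make the offense step a bit more concrete, here's a rough Python sketch of the "base runs, then custom linear weights" idea. Big caveats: the coefficients are from one commonly cited simple version of base runs, not the era-optimized model I'd actually fit, and the names (BattingLine, plus_one_weights, etc.) are just made up for illustration. The weights come from the "plus one" trick: add one of each event to the league totals and see how much base runs changes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BattingLine:
    """Counting stats for a league, team, or player (hypothetical container)."""
    singles: float
    doubles: float
    triples: float
    hr: float
    bb: float
    outs: float  # approximated here as AB - H


EVENTS = ("singles", "doubles", "triples", "hr", "bb", "outs")


def base_runs(line: BattingLine) -> float:
    """Simple base runs estimate: A * B / (B + C) + D.

    Coefficients are an assumption (a commonly cited basic version),
    not a model fitted to any particular era.
    """
    hits = line.singles + line.doubles + line.triples + line.hr
    total_bases = (line.singles + 2 * line.doubles
                   + 3 * line.triples + 4 * line.hr)
    a = hits + line.bb - line.hr                       # baserunners
    b = (1.4 * total_bases - 0.6 * hits
         - 3 * line.hr + 0.1 * line.bb) * 1.02         # advancement
    c = line.outs                                      # outs
    d = line.hr                                        # automatic runs
    return a * b / (b + c) + d


def plus_one_weights(league: BattingLine) -> dict:
    """Run value of each event: change in base runs when one more occurs."""
    baseline = base_runs(league)
    weights = {}
    for event in EVENTS:
        bumped = replace(league, **{event: getattr(league, event) + 1})
        weights[event] = base_runs(bumped) - baseline
    return weights


def linear_weights_runs(player: BattingLine, weights: dict) -> float:
    """Absolute runs created: sum of (event count * event weight)."""
    return sum(weights[event] * getattr(player, event) for event in EVENTS)


# Made-up totals, roughly the size of one team-season, just to exercise the code.
example_league = BattingLine(singles=980, doubles=280, triples=30,
                             hr=160, bb=530, outs=4050)
example_weights = plus_one_weights(example_league)
```

Note that these are absolute weights (the out value is only slightly negative), so to get runs above average I'd still have to subtract what a league-average hitter would produce in the same number of plate appearances. Park factors aren't in here either.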
Anyway, should be fun. Hopefully someone out there will find it interesting, but if nothing else I'll get to learn a lot about some of the old Reds teams. :)

As a warning, I have no idea about the timeframe on any of this, as I've got a ton of stuff on my plate this semester...writing up my dissertation, applying to jobs, teaching my first lecture course as an adjunct faculty member at a community college, etc. But it's something I'd like to do, and probably wouldn't be too time-consuming if I can automate a lot of stuff in Excel.