I'll post an updated version of this study when I get a chance. The results are far less surprising, and much more in line with other work on lineups than these were.
There was a chorus of complaints from Reds' faithful over the Reds' opening-day lineup, and this has continued in reaction to subsequent lineups over the Reds' first two weeks of play. I asked folks to submit to me what they felt the opening day lineup should have been in a "Can you make a better lineup than Dusty Baker" contest of sorts.
I've now taken those lineups and, with the help of PECOTA '08 projections, I've plugged them all into John Beamer's Markov Chain spreadsheet that came with the Hardball Times 2008 Annual. The results surprised me, and I expect that they'll surprise a good number of you as well.
Background - Evaluating Lineups (you can skip if you want)
The past several years have seen increasingly sophisticated work done in the area of lineup construction. Among the most widely publicized tools to stem from this effort is David Pinto's lineup tool, which I first discussed roughly two years ago. It is based on a set of studies that used linear regressions to relate player OBP and SLG at each lineup position to runs scored. Those regressions indicated that we should make some changes to how lineups have traditionally been designed. Around the same time, The Book was published; it used a different approach, but arrived at many of the same conclusions. Check out my old post for details.
The problem with the approach upon which Pinto's tool, in particular, is built is that a lineup is an incredibly dynamic thing. Regressions just report typical relationships between OBP and SLG across different lineup spots based on how MLB managers have filled out their lineup cards in the past. That's a different thing from having a tool that you can use to try out radically different lineups, as Pinto's tool permits one to do. For example, if you hit the pitcher 1st, that will have a large effect on the run-producing opportunities of the #2 and #3 hitters compared to hitting an on-base machine like Scott Hatteberg 1st. In other words, the player placed into each of the lineup slots will have direct effects on the opportunities of all other players in the lineup, and the consequences of a seemingly minor substitution might not be immediately apparent. Regression simply will not capture these interactions.
Enter Markov Chains. Markov chain models are models that organize complicated processes into steps. At each step, you input a certain probability that a variety of specific events will occur. The result is a series of branching event chains, which are then summarized according to their probabilities. John Beamer described them here in his introduction to his model.
It turns out that this is a great way to model a lineup's performance--if you can input the chances of different offensive events happening for each player (using, for example, PECOTA projections), you can have the Markov model step through a lineup over the course of a game, keeping track of outs and innings, while also tracking the entire range of possibilities in terms of offensive production. In other words, if you can design a Markov model that fits the game of baseball, and give it accurate data, it will provide you with more precise estimates of a lineup's offensive production than regression coefficients can ever hope to, because it actually attempts to simulate baseball.
In the 2008 THT Annual, John Beamer released an Excel-based Markov chain model that is apparently capable of accurately modeling baseball just as I described above. It is probably the best publicly available tool for lineup analysis.
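To make the idea concrete, here is a minimal Python sketch of a lineup Markov chain. It is not John Beamer's spreadsheet and makes no attempt to match it: the state layout, the crude baserunning rule (every runner advances exactly as far as the batter, and walks only move forced runners), the `max_pa` and `tol` cutoffs, and the event probabilities in the example are all assumptions made up for illustration, not PECOTA numbers.

```python
from collections import defaultdict

HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}
EMPTY = (0, 0, 0)  # (first, second, third) occupancy flags


def advance(bases, event):
    """Return (new_bases, runs_scored) under deliberately crude baserunning."""
    first, second, third = bases
    if event == "walk":
        # Only forced runners move up.
        if first and second and third:
            return (1, 1, 1), 1
        if first and second:
            return (1, 1, 1), 0
        if first:
            return (1, 1, third), 0
        return (1, second, third), 0
    n = HIT_BASES[event]
    new, runs = [0, 0, 0], 0
    # Position 0 is the batter; every runner moves n bases, same as the batter.
    for pos, occupied in ((3, third), (2, second), (1, first), (0, 1)):
        if occupied:
            if pos + n >= 4:
                runs += 1
            else:
                new[pos + n - 1] = 1
    return tuple(new), runs


def expected_runs(lineup, innings=9, max_pa=500, tol=1e-15):
    """Expected runs per game for a batting order.

    lineup: list of dicts mapping event name -> probability (each sums to 1).
    A state is (inning, outs, bases, next batter); one step is one plate
    appearance. max_pa and tol are arbitrary cutoffs (assumptions).
    """
    dist = {(0, 0, EMPTY, 0): 1.0}  # probability mass on each live state
    total = 0.0
    for _ in range(max_pa):
        if not dist:
            break
        nxt = defaultdict(float)
        for (inning, outs, bases, batter), p in dist.items():
            on_deck = (batter + 1) % len(lineup)
            for event, q in lineup[batter].items():
                mass = p * q
                if mass < tol:
                    continue  # ignore negligible branches
                if event == "out":
                    if outs < 2:
                        nxt[(inning, outs + 1, bases, on_deck)] += mass
                    elif inning + 1 < innings:
                        nxt[(inning + 1, 0, EMPTY, on_deck)] += mass
                    # otherwise the game is over and the mass leaves the chain
                else:
                    new_bases, runs = advance(bases, event)
                    total += mass * runs
                    nxt[(inning, outs, new_bases, on_deck)] += mass
        dist = nxt
    return total


# Made-up hitter profiles, purely to show that batting order changes the answer.
weak = {"out": 0.75, "walk": 0.05, "single": 0.15,
        "double": 0.03, "triple": 0.0, "homer": 0.02}
strong = {"out": 0.60, "walk": 0.15, "single": 0.15,
          "double": 0.05, "triple": 0.0, "homer": 0.05}
print(expected_runs([strong] * 3 + [weak] * 6))
print(expected_runs([weak] * 6 + [strong] * 3))
```

Because expected runs are additive, the sketch only has to carry the probability of being in each state rather than the full distribution of scores. A real model would use much more careful baserunning tables and would handle steals, double plays, and extra innings.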
2008 Opening Day
First, let's look at the Opening Day lineup, isolating our choices to only those players that Dusty chose to play (i.e. no Joey Votto, no Jay Bruce, etc). That way, we can focus more on the effect of moving players around the batting order, and less on player substitutions.
Below is a table of the Reds' opening day players, along with their 2008 PECOTA projections broken down into my favorite set of diagnostic stats:
|Pitcher||Pitcher||363||35%||3%||0.202||0.123||0.156||0.153||0.030||0.309||--- ||--- ||--- |
I received 14 different lineups from users both here and at RedsZone, and I threw in a few of my own as well. There are actually 9! = 362,880 possible batting orders for these nine players (counting the pitcher slot) that we could theoretically try, but I think these represent a good part of the diversity of recommendations that most folks might like to try. Below, I list all of those lineups, as well as the Markov-based estimate of runs per game that those lineups could be expected to provide. The +/- Baker column lists the season-level differences in performance of each lineup compared to Dusty Baker's true opening day lineup (in italics).
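For anyone who wants to check the 9! figure, or eventually feed every possible order through a model, a short Python sketch follows. The player labels are hypothetical placeholders, not the actual opening day roster.

```python
import math
from itertools import islice, permutations

# Hypothetical placeholder labels; swap in the nine actual names.
players = [f"player_{i}" for i in range(1, 9)] + ["pitcher"]

n_orders = math.factorial(len(players))
print(n_orders)  # 362880

# permutations() is lazy, so we can peek at a few orders without
# building all 362,880 of them in memory.
for order in islice(permutations(players), 3):
    print(order)
```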
All I can say is "wow." Dusty's lineup didn't come out as #1, but it was darn hard to beat, and only an estimated 5 runs per season behind Bluzer's top-rated lineup.
Think about that. Baker's lineup violates one of the biggest "rules" for lineup construction that we stat people harp on--his leadoff hitter is projected to have a miserable 0.307 OBP this season. And yet, the interactions between players in his lineup are such that his lineup results in more wins per season than most other variants...at least, according to Markov. My own lineups, which I designed based largely on the lineup chapter in The Book, rated as fairly middle-of-the-pack, and came out a good 17 runs (~1.5 wins) behind Baker's lineup. And some of the user-submitted lineups, which look very reasonable to my eye, came out more than 30 runs per season behind Baker's. Again, "wow."
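The arithmetic behind these season-level numbers is simple enough to sketch in Python. This assumes a 162-game season and the common rule of thumb of roughly 10 runs per win; the post does not state which conversion factor it used, so `runs_per_win` is left as a parameter, and the function names are made up for illustration.

```python
GAMES_PER_SEASON = 162


def season_run_diff(rpg_lineup, rpg_baseline, games=GAMES_PER_SEASON):
    """Scale a runs-per-game gap between two lineups up to a full season."""
    return (rpg_lineup - rpg_baseline) * games


def runs_to_wins(runs, runs_per_win=10.0):
    """Convert runs to wins; 10 runs per win is an assumed rule of thumb."""
    return runs / runs_per_win


# A 38-run spread is a little under 4 wins at 10 runs per win.
print(runs_to_wins(38))  # 3.8
```

One thing this makes obvious: a gap of only 0.03 runs per game already works out to about 5 runs over a season, so small per-game differences in the model output are worth taking seriously.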
A few other observations and interpretations:
- The range of performances of different lineups was about 38 runs per season, despite all lineups featuring the exact same players. That's almost 4 wins worth of variation! More than I expected to see.
- Lineups with Keppinger leading off did better than lineups with Hatteberg leading off, despite them being rather similar hitters according to PECOTA--both are high OBP guys, though Hatteberg projects to have more power.
- Most of the "best" lineups have Phillips batting in the 5th spot or higher in the lineup. Many of the "worst" lineups (like mine) have Phillips batting in the 5th spot or lower.
- None of the lineups that bat Dunn in the #2 spot do very well, despite his crazy-high OBP.
- My "idiot" lineups, in which I just ranked players by a rate stat like OPS, did pretty well for themselves (better than my "smart" ones). Ranks by AVG did particularly well, despite batting Dunn 8th!
|Lineup Slot - # Innings led off|
|Top 5 Lineups |
|Bottom 5 Lineups |
- Baker's lineup is very different from the others:
- It has the lowest frequency with which the leadoff hitter would lead off innings, though only by a small amount. Still, it might be enough to help diminish the problem of Patterson leading off.
- It has the highest frequency with which the #3 hitter would lead off innings. The #3 hole is the spot in the lineup that most frequently bats with two outs and runners on in real baseball, but Baker's lineup might reduce this effect compared to the others.
- Baker's lineup also has the pitcher leading off innings more often than any other lineup. That can't possibly be a good thing, can it?
- Top-5 lineups vs. Bottom-5 lineups
- The #4 slot leads off innings an awful lot for the bottom-5 set of lineups. This is typically a power hitter spot (either Griffey or Encarnacion in these cases), so you'd normally want runners on base when they're hitting. This problem is minimized in Baker's lineup.
- One really neat finding: the standard deviation among lineup slots in the frequency with which they led off was consistently lower for "good" lineups than for "bad" ones. This might mean that the "bad" lineups have more bottlenecks that tend to kill innings, resulting in certain lineup spots being more likely to lead off the next inning. So, perhaps good lineups distribute the best hitters around the lineup more than poor ones?
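The spread measure in that last point can be computed along these lines. This is a Python sketch with made-up lead-off counts, not the model's actual output, and it assumes a population standard deviation taken over each slot's share of innings led off; the function name is hypothetical.

```python
from statistics import pstdev


def leadoff_spread(innings_led_off):
    """Std. dev. across lineup slots of each slot's share of innings led off."""
    total = sum(innings_led_off)
    shares = [count / total for count in innings_led_off]
    return pstdev(shares)


# Hypothetical counts of innings led off by slots #1 through #9.
balanced = [200, 170, 160, 165, 155, 150, 150, 155, 153]
lumpy = [260, 120, 110, 230, 140, 130, 120, 180, 168]

print(leadoff_spread(balanced))
print(leadoff_spread(lumpy))  # larger: a few slots lead off far more often
```

Using shares rather than raw counts keeps lineups comparable even when they produce slightly different numbers of plate appearances.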
Best of the Organization Lineups
The other set of lineups I requested involved folks' choice of any players from within the Reds' organization. I received 26 of these lineups. Here they are, again along with Dusty Baker's opening day lineup, as well as a few of his others from the first week.
- Player choice matters: here we see that Baker's opening day lineup can be beaten regularly by employing Joey Votto and Jay Bruce in favor of Scott Hatteberg and Corey Patterson.
- The difference between the best and worst lineups in this case was about 5 wins. That's lower than I expected given the 4-win range in the opening day dataset, but then again the personnel differences among these lineups aren't that dramatic.
- I was gratified to see that my "exploitative" lineup, which takes advantage of the lack of information in the model about L/R splits and strings together Votto, Dunn, and Bruce in the top-3 slots, did quite well. So I'm not completely hopeless...
- Mike Grayson's top-rated lineup has Jay Bruce and his projected 0.336 OBP in the leadoff slot, again indicating that the issue of OBP in the 1-hole is less of a big deal than it's often made out to be.
|Lineup Slot - Innings led off|
|Top 5 lineups|
|Bottom 5 lineups|
Not the same trend. Mostly. The top lineups continue to have a fairly low standard deviation (at least compared to the bottom 5 from the opening day lineups), but we see low variation here among the bottom lineups as well. What's different?
Well, the bottom lineups tend to have inferior players to the better lineups this time around. Norris Hopper appears frequently instead of Corey Patterson in these lineups, which is a poor trade according to PECOTA. And Baker's game 3 lineup includes Freel and Bako instead of Patterson and Valentin.
Well, there's obviously a lot more to do on this front. I feel like we've really just started to scratch the surface on this issue. But I wanted to end with a summary of some of the tentative conclusions that I'm taking from this work.
1. There isn't one best way to make a successful lineup. Lineup interactions are such that there may be several very different styles of lineups that each result in similar overall performance.
2. In general, playing the best players is more important than lineup order. That's not to say that lineup order doesn't matter, but a poorly structured lineup with great players can beat a perfectly structured lineup with weak players.
3. Spreading one's best (and worst) hitters out to prevent bottlenecks may be an overlooked yet highly influential means of improving the performance of a lineup. The first dataset indicates to me that it might be more important than some of the other things we tend to worry about (like OBP in the leadoff slot). At the very least, it's worth further study.
4. Seemingly small differences between players can cause substantial differences in lineup performance when they are swapped between lineup spots. Different sets of players may require substantially different lineup designs for optimal production.
5. We fans may not know as much about lineup construction as we think we do. In my case, at least, I'm feeling pretty clueless after working through this project. This post by J.C. Bradbury seems pretty apt. So, I'm inclined to give Dusty Baker the benefit of the doubt with his lineup order at this point...though I reserve the right to complain about personnel choices!
That's about as far as I'm willing and able to go for now. If you have suggestions, specific tests (or specific lineups) you'd like to see done, etc, let me know. I'm happy to continue working on this stuff as my (limited) time permits.
A few caveats and notes:
- All of the above results are obviously dependent on the quality and reliability of the Markov model upon which they're based. It seems really solid, but we should always keep this in mind. Again, I'll refer to the Bradbury post.
- As implied above, a different set of players--say, #1-#9 lineup splits for NL teams...not to mention AL teams--might result in very different findings. Such is the nature of this sort of thing.
- For pitchers and their pinch hitters, I just used 2007 Reds' pitcher totals. Maybe I should have used #9 hitter totals instead, as they would recognize the contributions of pinch hitters. But I didn't think about that until about a minute before hitting "publish post."
- With some assistance from John Beamer (thanks John!), I modified the spreadsheet to automatically input the actual % times each hitter led off an inning, based on the output of the model. You'll need to do this as well if you want to replicate my results. Drop me a line and I can help ya out on that.
- I used the standard baserunning tables in the spreadsheet, and I did click the Update SBA button after constructing each lineup. I have no idea how much modifying those baserunning tables might affect the results.
- I also just wanted to again thank John Beamer for publishing his Markov model with the Hardball Times Annual. If nothing else, it's provided a lot of food for thought!