On Baseball & The Reds: Markov: Dusty Baker's lineups aren't half bad

Sunday, April 13, 2008

Update: Due to several errors in how I was using John Beamer's Markov model, this study is frankly a load of hooey. I am leaving it here for archival purposes, but please disregard pretty much everything here. Sorry about that--I had done several checks to be sure I was "doing it right," but it turns out that the specific ways I was thinking about the input were absolutely incorrect. Embarrassing to say the least, but that's what happens with research now and then...

I'll post an updated version of this study when I get a chance. The results are far less surprising, and much more in line with other work on lineups than these were.

There was a chorus of complaints from Reds' faithful over the Reds' opening-day lineup, and this has continued in reaction to subsequent lineups over the Reds' first two weeks of play. I asked folks to submit to me what they felt the opening day lineup should have been in a "Can you make a better lineup than Dusty Baker" contest of sorts.

I've now taken those lineups and, with the help of PECOTA '08 projections, I've plugged them all into John Beamer's Markov Chain spreadsheet that came with the Hardball Times 2008 Annual. The results surprised me, and I expect that they'll surprise a good number of you as well.

Background - Evaluating Lineups (you can skip if you want)

The past several years have seen increasingly sophisticated work done in the area of lineup construction. Among the most widely publicized tools that stem from this effort is David Pinto's lineup tool, which I first discussed roughly two years ago. It is based on a set of studies that used linear regressions to relate player OBP and SLG at each lineup position to runs scored. Those regressions indicated that we should make some changes to how lineups have traditionally been designed. Around the same time, The Book was published; it used a different approach, but arrived at many of the same conclusions. Check out my old post for details.

The problem with the approach upon which Pinto's tool, in particular, is built is that a lineup is an incredibly dynamic thing. Regressions just report typical relationships between OBP and SLG across different lineup spots based on how MLB managers have filled out their lineup cards in the past. That's a different thing from having a tool that you can use to try out radically different lineups, as Pinto's tool permits one to do. For example, if you hit the pitcher 1st, that will have a large effect on the run-producing opportunities of the #2 and #3 hitters compared to hitting an on-base machine like Scott Hatteberg 1st. In other words, the player placed into each of the lineup slots will have direct effects on the opportunities of all other players in the lineup, and the consequences of a seemingly minor substitution might not be immediately apparent. Regression simply will not capture these interactions.

Enter Markov Chains. Markov chain models are models that organize complicated processes into steps. At each step, you input a certain probability that a variety of specific events will occur. The result is a series of branching event chains, which are then summarized according to their probabilities. John Beamer described them here in his introduction to his model.

It turns out that this is a great way to model a lineup's performance--if you can input the chances of different offensive events happening for each player (using, for example, PECOTA projections), you can have Markov step through a lineup throughout a game, keeping track of outs and innings, all the while keeping track of the entire range of possibilities in terms of offensive production. In other words, if you can design a Markov model that fits the game of baseball, and give it accurate data, it will provide you with more precise estimates of a lineup's offensive production than the regression coefficients can ever hope to, because it actually attempts to simulate baseball.
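To make the idea concrete, here is a minimal sketch of a Markov-style calculation in Python. It is not Beamer's spreadsheet: the state is just (outs, bases, batter), the event names and the hypothetical `hitter` rates are my own, it covers a single inning rather than a whole game, and it assumes every runner advances exactly as many bases as the batter does. Take it only as an illustration of how per-player event probabilities get turned into expected runs.

```python
from itertools import product

HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def advance(bases, event):
    """Apply one non-out event to a (first, second, third) occupancy tuple.

    Returns (new_bases, runs_scored). Simplifying assumption: on a hit,
    every runner advances exactly as many bases as the batter does.
    """
    first, second, third = bases
    if event == "walk":
        # Only forced runners move up on a walk.
        runs = 1 if (first and second and third) else 0
        return (True, first or second, third or (first and second)), runs
    n = HIT_BASES[event]
    new_bases, runs = [False, False, False], 0
    occupied = [i + 1 for i, on in enumerate(bases) if on]
    for start in occupied + [0]:  # 0 stands for the batter at home plate
        end = start + n
        if end >= 4:
            runs += 1
        else:
            new_bases[end - 1] = True
    return tuple(new_bases), runs


def expected_runs_in_inning(lineup, leadoff=0, sweeps=200):
    """Expected runs in one inning when lineup[leadoff] bats first.

    lineup is a list of nine dicts mapping event name -> probability.
    """
    all_bases = list(product((False, True), repeat=3))
    states = [(o, b, k) for o in range(3) for b in all_bases for k in range(9)]
    value = dict.fromkeys(states, 0.0)
    for _ in range(sweeps):  # each sweep looks one plate appearance further ahead
        updated = {}
        for outs, bases, batter in states:
            on_deck = (batter + 1) % 9
            total = 0.0
            for event, prob in lineup[batter].items():
                if event == "out":
                    if outs < 2:  # a third out ends the inning: no more runs
                        total += prob * value[(outs + 1, bases, on_deck)]
                else:
                    new_bases, runs = advance(bases, event)
                    total += prob * (runs + value[(outs, new_bases, on_deck)])
            updated[(outs, bases, batter)] = total
        value = updated
    return value[(0, (False, False, False), leadoff)]


# Hypothetical per-PA rates for illustration only (not PECOTA numbers).
hitter = {"out": 0.66, "walk": 0.09, "single": 0.16,
          "double": 0.05, "triple": 0.01, "homer": 0.03}
print(round(expected_runs_in_inning([hitter] * 9), 3))
```

A full-game model would also carry the identity of the next batter from one inning into the next, which is how a tool like this can report how often each slot leads off.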

In the 2008 THT Annual, John Beamer released an Excel-based Markov chain model that is apparently capable of accurately modeling baseball just as I described above. It is probably the best publicly available tool for lineup analysis.

2008 Opening Day

First, let's look at the Opening Day lineup, isolating our choices to only those players that Dusty chose to play (i.e. no Joey Votto, no Jay Bruce, etc). That way, we can focus more on the effect of moving players around the batting order, and less on player substitutions.

Below is a table of the Reds' opening day players, along with their 2008 PECOTA projections broken down into my favorite set of diagnostic stats:

Last	First	PA	%K	%BB	BABIP	AVG	OBP	SLG	ISO	OPS	lwts_RC	R/G	wOBA
Patterson	Corey	458	16%	5%	0.306	0.268	0.307	0.402	0.134	0.709	56.3	4.76	0.300
Keppinger	Jeff	514	6%	8%	0.317	0.305	0.364	0.418	0.113	0.781	70.6	5.82	0.336
Griffey	Ken	435	16%	11%	0.283	0.268	0.350	0.480	0.213	0.830	65.7	6.20	0.354
Phillips	Brandon	629	16%	6%	0.302	0.274	0.325	0.444	0.170	0.769	86.4	5.43	0.324
Dunn	Adam	579	25%	16%	0.298	0.261	0.388	0.549	0.288	0.937	103.0	7.73	0.392
Encarnacion	Edwin	561	16%	8%	0.308	0.285	0.356	0.493	0.208	0.850	88.5	6.51	0.354
Hatteberg	Scott	278	9%	11%	0.296	0.285	0.368	0.440	0.155	0.808	40.3	6.12	0.352
Valentin	Javier	208	12%	9%	0.287	0.269	0.333	0.424	0.155	0.757	27.1	5.22	0.327
Pitcher	Pitcher	363	35%	3%	0.202	0.123	0.156	0.153	0.030	0.309	---	---	---

Note: Markov pays no attention to plate appearances, except to generate frequencies of each event happening. Therefore, it doesn't matter if a player has 200 or 500 PAs; what matters is the number of singles, doubles, strikeouts, walks, etc., per plate appearance. Also, Markov doesn't know about lefty/righty splits--it just assumes an average pitcher, who is mostly (but not entirely) right-handed. :)
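The point about plate appearances can be sketched in a few lines of Python. The `event_rates` helper and both counting lines below are made up for illustration (they are not PECOTA projections or anything from the spreadsheet); the only thing shown is that two players with different playing time but the same per-PA frequencies look identical to a rate-based model.

```python
def event_rates(counts):
    """Convert raw event counts into per-plate-appearance probabilities."""
    plate_appearances = sum(counts.values())
    if plate_appearances == 0:
        raise ValueError("need at least one plate appearance")
    return {event: n / plate_appearances for event, n in counts.items()}


# Hypothetical counting lines: a 200-PA part-timer and a 500-PA regular.
part_timer = {"out": 140, "walk": 20, "single": 28, "double": 8, "triple": 0, "homer": 4}
regular = {"out": 350, "walk": 50, "single": 70, "double": 20, "triple": 0, "homer": 10}

print(event_rates(part_timer))
print(event_rates(part_timer) == event_rates(regular))
```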

I received 14 different lineups from users both here and at RedsZone, and I threw in a few of my own as well. There are actually 9! = 362,880 possible orderings of these nine players (counting the pitcher slot) that we could theoretically try, but I think these represent a good part of the diversity of recommendations that most folks might like to try. Below, I list all of those lineups, as well as the Markov-based estimate of runs per game that those lineups could be expected to provide. The vBaker column lists the season-level (162-game) difference in runs between each lineup and Dusty Baker's true opening day lineup (the Baker-OD row).
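For a sense of scale, the 9! figure is easy to check, and Python's standard library can enumerate the orderings lazily. This is only a counting sketch; actually scoring every ordering would need a model like the spreadsheet's, which is not reproduced here.

```python
import math
from itertools import islice, permutations

players = ["Patterson", "Keppinger", "Griffey", "Phillips", "Dunn",
           "Encarnacion", "Hatteberg", "Valentin", "Pitcher"]

total_orders = math.factorial(len(players))
print(total_orders)

# permutations() is lazy, so peeking at the first few orderings is cheap.
first_three = list(islice(permutations(players), 3))
for order in first_three:
    print(order)
```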

Name	1st	2nd	3rd	4th	5th	6th	7th	8th	9th	R/G	R/162G	vBaker
Bluzer-OD	Keppinger	Hatteberg	Phillips	Dunn	Griffey	Encarnacion	Patterson	Valentin	Pitcher	4.90	793.6	4.5
Chris-OD2	Keppinger	Encarnacion	Dunn	Phillips	Griffey	Hatteberg	Patterson	Valentin	Pitcher	4.89	792.9	3.8
Pickoff-OD	Keppinger	Encarnacion	Dunn	Griffey	Phillips	Hatteberg	Patterson	Valentin	Pitcher	4.88	790.5	1.4
Baker-OD	Patterson	Keppinger	Griffey	Phillips	Dunn	Encarnacion	Hatteberg	Valentin	Pitcher	4.87	789.1	0.0
AVG-rank	Keppinger	Encarnacion	Hatteberg	Phillips	Patterson	Griffey	Valentin	Dunn	Pitcher	4.86	788.1	-1.0
brad-OD1	Keppinger	Hatteberg	Griffey	Dunn	Encarnacion	Phillips	Patterson	Valentin	Pitcher	4.86	787.6	-1.5
SLG-rank	Dunn	Encarnacion	Griffey	Phillips	Hatteberg	Patterson	Valentin	Keppinger	Pitcher	4.86	786.8	-2.3
OPS-rank	Dunn	Encarnacion	Griffey	Hatteberg	Keppinger	Phillips	Valentin	Patterson	Pitcher	4.84	783.4	-5.7
OBP-rank	Dunn	Keppinger	Hatteberg	Encarnacion	Griffey	Valentin	Phillips	Patterson	Pitcher	4.83	782.7	-6.4
texasdave-OD	Keppinger	Hatteberg	Dunn	Encarnacion	Griffey	Phillips	Valentin	Patterson	Pitcher	4.80	777.7	-11.4
Chris-OD	Keppinger	Hatteberg	Encarnacion	Dunn	Griffey	Valentin	Phillips	Patterson	Pitcher	4.79	775.7	-13.4
jinaz-OD	Encarnacion	Dunn	Keppinger	Griffey	Hatteberg	Phillips	Valentin	Patterson	Pitcher	4.77	772.2	-16.9
jinaz-OD-exploit	Encarnacion	Dunn	Hatteberg	Griffey	Keppinger	Phillips	Valentin	Patterson	Pitcher	4.76	771.5	-17.6
Trace's Daddy-OD	Hatteberg	Dunn	Griffey	Phillips	Encarnacion	Keppinger	Valentin	Patterson	Pitcher	4.76	770.7	-18.4
justincredible-OD	Hatteberg	Keppinger	Griffey	Dunn	Phillips	Encarnacion	Patterson	Valentin	Pitcher	4.75	770.0	-19.1
OesterPoster-OD	Hatteberg	Keppinger	Griffey	Dunn	Phillips	Patterson	Encarnacion	Valentin	Pitcher	4.74	768.0	-21.1
mlbfan30-OD	Hatteberg	Keppinger	Dunn	Encarnacion	Griffey	Phillips	Patterson	Valentin	Pitcher	4.72	764.8	-24.3
Degenerate-OD	Hatteberg	Keppinger	Dunn	Encarnacion	Griffey	Phillips	Patterson	Valentin	Pitcher	4.72	764.8	-24.3
fareast-OD	Hatteberg	Keppinger	Phillips	Griffey	Dunn	Encarnacion	Valentin	Patterson	Pitcher	4.70	761.4	-27.7
brad-OD2	Hatteberg	Dunn	Encarnacion	Griffey	Phillips	Keppinger	Patterson	Valentin	Pitcher	4.69	760.6	-28.5
joel-OD1	Hatteberg	Dunn	Encarnacion	Griffey	Phillips	Patterson	Valentin	Pitcher	Keppinger	4.68	758.9	-30.2
redsmanrick-OD	Hatteberg	Dunn	Encarnacion	Griffey	Keppinger	Phillips	Patterson	Valentin	Pitcher	4.66	755.5	-33.6

All I can say is "wow." Dusty's lineup didn't come out as #1, but it was darn hard to beat, and only an estimated 5 runs per season behind Bluzer's top-rated lineup.

Think about that. Baker's lineup violates one of the biggest "rules" for lineup construction that we stat people harp on--his leadoff hitter is projected to have a miserable 0.307 OBP this season. And yet, the interactions between players in his lineup are such that his lineup results in more runs per season than most other variants...at least, according to Markov. My own lineups, which I designed based largely on the lineup chapter in The Book, rated as fairly middle-of-the-pack, and came out a good 17 runs (~1.5 wins) behind Baker's lineup. And some of the user-submitted lineups, which look very reasonable to my eye, came out more than 30 runs per season behind Baker's. Again, "wow."

A few other observations and interpretations:

The range of performances of different lineups was about 38 runs per season, despite all lineups featuring the exact same players. That's almost 4 wins worth of variation! More than I expected to see.
Lineups with Keppinger leading off did better than lineups with Hatteberg leading off, despite them being rather similar hitters according to PECOTA--both are high OBP guys, though Hatteberg projects to have more power.
Most of the "best" lineups have Phillips batting in the 5th spot or higher in the lineup. Many of the "worst" lineups (like mine) have Phillips batting in the 5th spot or lower.
None of the lineups that bat Dunn in the #2 spot do very well, despite his crazy-high OBP.
My "idiot" lineups, in which I just ranked players by a rate stat like OPS, did pretty well for themselves (better than my "smart" ones). Ranks by AVG did particularly well, despite batting Dunn 8th!
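The run and win figures in these observations are simple arithmetic on the R/162G column. A quick sketch, using three rows from the opening day table and assuming the common rule of thumb of roughly 10 runs per win (that conversion factor is my assumption here, not something the spreadsheet reports):

```python
RUNS_PER_WIN = 10.0  # assumed rule of thumb, not a value from the model

# R/162G values copied from the opening day table.
season_runs = {"Bluzer-OD": 793.6, "Baker-OD": 789.1, "redsmanrick-OD": 755.5}


def versus_baker(name):
    """Season-level run difference relative to Baker's opening day lineup."""
    return season_runs[name] - season_runs["Baker-OD"]


spread_runs = max(season_runs.values()) - min(season_runs.values())
spread_wins = spread_runs / RUNS_PER_WIN

print(round(versus_baker("Bluzer-OD"), 1))
print(round(spread_runs, 1), round(spread_wins, 1))
```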

Markov also can report how often each lineup slot will lead off an inning for any lineup configuration. Here is Baker's lineup, as well as the top-5 and bottom-5 lineups, broken down by how many innings each slot led off (on average):

	Lineup Slot - # Innings led off
Lineup Name	1	2	3	4	5	6	7	8	9	StDev
Baker-OD	1.76	0.77	0.81	0.95	1.04	0.86	0.89	0.85	1.06	0.30
Top 5 Lineups
Bluzer-OD	1.80	0.78	0.79	1.08	0.95	0.85	1.15	0.88	0.72	0.33
Chris-OD2	1.81	0.77	0.79	1.03	1.02	0.88	1.13	0.83	0.73	0.33
Pickoff-OD	1.86	0.77	0.78	1.03	0.97	0.89	1.14	0.83	0.73	0.35
AVG-rank	1.78	0.77	0.68	1.01	1.05	0.89	1.13	0.92	0.78	0.33
brad-OD1	1.8	0.8	0.8	1.1	1	0.9	1.1	0.9	0.7	0.33
Bottom 5 Lineups
Degenerate-OD	1.80	0.77	0.79	1.38	0.93	0.86	0.90	0.83	0.73	0.36
fareast-OD	1.82	0.76	0.68	1.37	0.95	0.82	0.90	0.85	0.85	0.36
brad-OD2	1.84	0.77	0.79	1.38	0.93	0.80	0.86	0.90	0.72	0.37
joel-OD1	1.82	0.82	0.67	1.37	0.96	0.80	0.87	0.90	0.78	0.37
redsmanrick-OD	1.84	0.77	0.79	1.38	0.93	0.81	0.87	0.89	0.72	0.37

Observations:

Baker's lineup is very different from the others:

It has the lowest frequency with which the leadoff hitter would lead off innings, though only by a small amount. Still, it might be enough to help diminish the problem of Patterson leading off.
It has the highest frequency with which the #3 hitter would lead off innings. The #3 hole is the spot in the lineup that most frequently bats with two outs and runners on in real baseball, but Baker's lineup might reduce this effect compared to the others.
Baker's lineup also has the pitcher leading off innings more often than any other lineup. That can't possibly be a good thing, can it?

Top-5 lineups vs. Bottom-5 lineups

The #4 slot leads off innings an awful lot for the bottom-5 set of lineups. This is typically a power hitter spot (either Griffey or Encarnacion in these cases), so you'd normally want runners on base when they're hitting. This problem is minimized in Baker's lineup.
One really neat finding: the standard deviation among lineup slots in the frequency with which they led off was consistently lower for "good" lineups than for "bad" ones. This might mean that the "bad" lineups have more bottlenecks that tend to kill innings, resulting in certain lineup spots being more likely to lead off the next inning. So, perhaps good lineups distribute the best hitters around the lineup more than poor ones?
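For anyone who wants to reproduce the StDev column, the values in the table are consistent with a sample standard deviation (n - 1 in the denominator) taken across the nine slots. A short Python check on two rows copied from the table above:

```python
from statistics import stdev  # sample standard deviation (n - 1 denominator)

# Innings led off per lineup slot (1-9), copied from the table above.
innings_led_off = {
    "Baker-OD": [1.76, 0.77, 0.81, 0.95, 1.04, 0.86, 0.89, 0.85, 1.06],
    "redsmanrick-OD": [1.84, 0.77, 0.79, 1.38, 0.93, 0.81, 0.87, 0.89, 0.72],
}

for name, slots in innings_led_off.items():
    print(name, round(stdev(slots), 2))
```

A lower value means leadoff duty is spread more evenly across the slots after the first.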

...I'd like to do a bit more with these data, breaking down the different lineups based on their rate stats, but this has taken long enough to get written as it is. So, let's move on to the Best of the Organization lineups.

Best of the Organization Lineups

The other set of lineups I requested involved folks' choice of any players from within the Reds' organization. I received 26 of these lineups. Here they are, again along with Dusty Baker's opening day lineup, as well as a few of his others from the first week.

Name	1st	2nd	3rd	4th	5th	6th	7th	8th	9th	R/G	R/162G	vBaker
mikegrayson-Best	Bruce	Keppinger	Griffey	Dunn	Encarnacion	Phillips	Votto	Ross	Pitcher	5.01	811.3	22.2
brad-Best	Keppinger	Votto	Bruce	Dunn	Griffey	Encarnacion	Phillips	Valentin	Pitcher	5.00	810.3	21.2
Chris-Best1	Keppinger	Dunn	Griffey	Bruce	Votto	Encarnacion	Phillips	Valentin	Pitcher	4.99	808.8	19.7
jinaz-Best-exploit	Votto	Dunn	Bruce	Encarnacion	Griffey	Keppinger	Phillips	Valentin	Pitcher	4.95	802.0	12.9
ED44-Best2	Keppinger	Votto	Dunn	Phillips	Griffey	Encarnacion	Patterson	Valentin	Pitcher	4.95	801.9	12.8
Alex-Best	Keppinger	Votto	Dunn	Phillips	Griffey	Encarnacion	Patterson	Valentin	Pitcher	4.95	801.9	12.8
Alex-best3	Freel	Keppinger	Dunn	Phillips	Griffey	Encarnacion	Votto	Valentin	Pitcher	4.94	799.7	10.6
Baker-Game4	Patterson	Keppinger	Griffey	Phillips	Dunn	Encarnacion	Votto	Valentin	Pitcher	4.93	798.0	8.9
ED44-Best	Keppinger	Bruce	Dunn	Phillips	Griffey	Votto	Encarnacion	Valentin	Pitcher	4.92	797.7	8.6
Chris-Best2	Bruce	Keppinger	Encarnacion	Dunn	A_Phillips	Griffey	Phillips	Hanigan	Pitcher	4.90	794.0	4.9
justincredible-Best	Votto	Keppinger	Griffey	Dunn	Phillips	Bruce	Encarnacion	Ross	Pitcher	4.88	789.8	0.7
mlbfan30-B	Votto	Keppinger	Dunn	Encarnacion	Griffey	Phillips	Bruce	Valentin	Pitcher	4.88	789.8	0.7
Baker-OD	Patterson	Keppinger	Griffey	Phillips	Dunn	Encarnacion	Hatteberg	Valentin	Pitcher	4.87	789.1	0.0
Alex-best2	Hopper	Keppinger	Dunn	Phillips	Griffey	Encarnacion	Votto	Valentin	Pitcher	4.85	785.7	-3.4
redsmanrick-best	Votto	Dunn	Encarnacion	Griffey	Keppinger	Bruce	Phillips	Ross	Pitcher	4.85	785.6	-3.5
Enquirer-vsR	Votto	Dunn	Encarnacion	Bruce	Griffey	Phillips	Valentin	Keppinger	Pitcher	4.84	783.9	-5.2
jinaz-Best	Votto	Encarnacion	Dunn	Phillips	Griffey	Bruce	Keppinger	Valentin	Pitcher	4.83	782.0	-7.1
joel-Best1	Votto	Dunn	Encarnacion	Griffey	Bruce	Phillips	Valentin	Pitcher	Keppinger	4.81	779.8	-9.3
brad-Best	Votto	Dunn	Encarnacion	Griffey	Bruce	Phillips	Keppinger	Valentin	Pitcher	4.80	778.4	-10.7
joel-Best2	Keppinger	Dunn	Phillips	Bruce	Encarnacion	Griffey	Ross	Pitcher	Gonzalez	4.80	778.2	-10.9
joel-Best3	Keppinger	Dunn	Phillips	Encarnacion	Griffey	Patterson	Ross	Pitcher	Hatteberg	4.79	776.5	-12.6
DannyB-B	Keppinger	Dunn	Griffey	Encarnacion	Votto	Phillips	Hopper	Valentin	Pitcher	4.77	773.2	-15.9
texasdave-B	Keppinger	Votto	Dunn	Encarnacion	Griffey	Phillips	Ross	Hopper	Pitcher	4.76	771.3	-17.8
fareast-Best	Votto	Phillips	Griffey	Dunn	Bruce	Encarnacion	Gonzalez	Ross	Pitcher	4.76	770.5	-18.6
PastAndPending-B2	Freel	Keppinger	Bruce	Griffey	Phillips	Dunn	Votto	Ross	Pitcher	4.74	767.9	-21.2
Bluzer-B	Hopper	Keppinger	Dunn	Phillips	Griffey	Encarnacion	Valentin	Pitcher	Freel	4.74	767.4	-21.7
Baker-Game2	Patterson	Keppinger	Griffey	Phillips	Dunn	Encarnacion	Votto	Bako	Pitcher	4.73	766.3	-22.8
PastAndPending-B	Hopper	Keppinger	Griffey	Phillips	Dunn	Votto	Encarnacion	Valentin	Pitcher	4.72	764.9	-24.2
JamesB-B	Hopper	Keppinger	Griffey	Dunn	Phillips	Votto	Encarnacion	Valentin	Pitcher	4.72	764.4	-24.7
redsmanrick-best2	Keppinger	Phillips	Dunn	Encarnacion	Griffey	Ross	Votto	Pitcher	Hopper	4.71	763.2	-25.9
DannyB-B2	Hopper	Dunn	Griffey	Encarnacion	Votto	Phillips	Gonzalez	Ross	Pitcher	4.69	759.3	-29.8
Enquirer-vsL	Keppinger	Dunn	Phillips	Bruce	Encarnacion	Griffey	Ross	Gonzalez	Pitcher	4.68	758.7	-30.4
Baker-Game3	Freel	Keppinger	Griffey	Phillips	Dunn	Encarnacion	Hatteberg	Bako	Pitcher	4.63	750.1	-39.0

Thoughts:

Player choice matters: here we see that Baker's opening day lineup can be beat regularly by employing Joey Votto and Jay Bruce in favor of Scott Hatteberg and Corey Patterson.
The difference between the best and worst lineups in this case was about 5 wins. That's lower than I expected given the 4-win range in the opening day dataset, but then again the personnel differences among these lineups aren't that dramatic.
I was gratified to see that my "exploitative" lineup, which takes advantage of the lack of information in the model about L/R splits and strings together Votto, Dunn, and Bruce in the top-3 slots, did quite well. So I'm not completely hopeless...
Mike Grayson's top-rated lineup has Jay Bruce and his projected 0.336 OBP in the leadoff slot, again indicating that the issue of OBP in the 1-hole is less of a big deal than it's often made out to be.

Let's look at the distribution of how often hitters lead off innings from this dataset and see if we see a similar trend to the prior dataset:

	Lineup Slot - Innings led off
Lineup Name	1	2	3	4	5	6	7	8	9	StDev
Baker-OD	1.76	0.77	0.81	0.95	1.04	0.86	0.89	0.85	1.06	0.30
Top 5 lineups
mikegrayson-Best	1.77	0.77	0.78	0.94	0.98	0.88	0.91	0.90	1.06	0.30
brad-Best	1.80	0.78	0.77	1.07	0.89	0.88	1.17	0.88	0.77	0.33
Chris-Best1	1.79	0.77	0.77	1.03	0.89	0.88	1.17	0.87	0.83	0.32
jinaz-Best-exploit	1.78	0.77	0.77	1.38	0.87	0.89	0.87	0.88	0.79	0.35
ED44-Best2	1.80	0.78	0.79	1.08	1.02	0.83	1.14	0.84	0.72	0.33
Bottom 5 lineups
JamesB-B	1.77	0.78	0.79	1.03	0.97	0.79	0.91	0.92	1.05	0.31
redsmanrick-best2	1.79	0.80	0.73	0.95	1.06	0.91	1.12	0.84	0.80	0.32
DannyB-B2	1.75	0.76	0.78	1.03	1.02	0.88	0.77	0.93	1.07	0.30
Enquirer-vsL	1.83	0.76	0.67	1.36	0.95	0.81	0.89	0.91	0.82	0.37
Baker-Game3	1.76	0.77	0.78	0.95	1.05	0.86	0.91	0.87	1.07	0.30

Not the same trend. Mostly. The top lineups continue to have a fairly low standard deviation (at least compared to the bottom 5 from the opening day lineups), but we see low variation here among the bottom lineups as well. What's different?

Well, the bottom lineups tend to have inferior players to the better lineups this time around. Norris Hopper appears frequently instead of Corey Patterson in these lineups, which is a poor trade according to PECOTA. And Baker's game 3 lineup includes Freel and Bako instead of Patterson and Valentin.

...

Well, there's obviously a lot more to do on this front. I feel like we've really just started to scratch the surface on this issue. But I wanted to end with a summary of some of the tentative conclusions that I'm taking from this work.

1. There isn't one best way to make a successful lineup. Lineup interactions are such that there may be several very different styles of lineups that each result in similar overall performance.

2. In general, playing the best players is more important than lineup order. That's not to say that lineup order doesn't matter, but a poorly structured lineup with great players can beat a perfectly structured lineup with weak players.

3. Spreading one's best (and worst) hitters out to prevent bottlenecks may be an overlooked yet highly influential means of improving the performance of a lineup. The first dataset indicates to me that it might be more important than some of the other things we tend to worry about (like OBP in the leadoff slot). At the very least, it's worth further study.

4. Seemingly small differences between players can cause substantial differences in lineup performance when they are swapped between lineup spots. It may be that different sets of players may require substantially different lineup designs for optimal production.

5. We fans may not know as much about lineup construction as we think we do. In my case, at least, I'm feeling pretty clueless after working through this project. This post by J.C. Bradbury seems pretty apt. So, I'm inclined to give Dusty Baker the benefit of the doubt with his lineup order at this point...though I reserve the right to complain about personnel choices!

That's about as far as I'm willing and able to go for now. If you have suggestions, specific tests (or specific lineups) you'd like to see done, etc, let me know. I'm happy to continue working on this stuff as my (limited) time permits.

A few caveats and notes:

All of the above results are obviously dependent on the quality and reliability of the Markov model upon which they're based. It seems really solid, but we should always keep this in mind. Again, I'll refer to the Bradbury post.
As implied above, a different set of players--say, #1-#9 lineup splits for NL teams...not to mention AL teams--might result in very different findings. Such is the nature of this sort of thing.
For pitchers and their pinch hitters, I just used 2007 Reds' pitcher totals. Maybe I should have used #9 hitter totals instead, as they would recognize the contributions of pinch hitters. But I didn't think about that until about a minute before hitting "publish post."
With some assistance from John Beamer (thanks John!), I modified the spreadsheet to automatically input the actual % times each hitter led off an inning, based on the output of the model. You'll need to do this as well if you want to replicate my results. Drop me a line and I can help ya out on that.
I used the standard baserunning tables in the spreadsheet, and I did click the Update SBA button after constructing each lineup. I have no idea how much modifying those baserunning tables might affect the results.
I also just wanted to again thank John Beamer for publishing his Markov model with the Hardball Times Annual. If nothing else, it's provided a lot of food for thought!

Photo by Getty Images/Jonathan Daniel

Additional discussion about this project can be found at:

31 comments:

Anonymous, Sunday, April 13, 2008
Wow, I guess I should gloat a little bit, but when you think about it, .03 R/G is not much to gloat about, is it?
Joel, Sunday, April 13, 2008
I was in the bottom 5? Well, obviously your methodology is wrong. :)
jinaz, Sunday, April 13, 2008
I wasn't sure if I should attach folks' names or not to the lineups. But I figured that most people would want to be able to easily find "their" lineup in the list, so kept 'em. But it does mean that I'm identifying folks as being among the bottom 5, which kind of sucks.. -j
Joel, Sunday, April 13, 2008
I honestly don't mind. I think the potential findings that you listed are interesting. It kind of makes sense that you don't want to bottleneck your lineup. Given the looping nature of a lineup, you probably help yourself more by having a mix of hitters that can extend innings and such.
Unknown, Sunday, April 13, 2008
Justin -- great work. And great to see someone putting the Markov to use.

It just goes to show how complex the interactions between different hitters can be.

It also goes to show that at the margins line-up doesn't make *that* much difference
Anonymous, Sunday, April 13, 2008
Justin,

I assume that the Markov model does not include baserunning?

Also, does the model include GDP's and the differences, for example, between a strike out and batted ball out (moving runners over, etc.)?

Does it move runners over differently on a batted ball out, depending on the handedness of the batter?

Finally, if it uses GDP, does it use a GDP projection for each player or does it assume the same GIDP rate for all players? If the latter, is it based on BIP for that player, and does it distinguish between RH and LH batters (the former hit into DP's more than twice the rate as the latter)?

Obviously these are all important variables for lineup construction, especially the base running and GIDP rates.

I have a sim that incorporates everything. I can run it on several lineups if you want and report back.

Also, one of the more important things about lineup construction, I think, is making sure you use an optimal lineup versus RH and LH starting pitchers. They can be very different. If you want to evaluate a manager's lineup, you have to look at his two versions. For example, let's say that he has a great overall (against all starters) lineup, but that he uses the same one against RHP as LHP. Well, it might be a bad lineup against each, even though it is great overall (unlikely, but possible).

Plus, splitting up lefties, even if it costs a few runs a year, is important I think, and should be taken into consideration.

MGL
Anonymous, Sunday, April 13, 2008
Are you using the Team Batting tab? I'm using the THT Markov spreadsheet, but the "number of innings led off" row doesn't change on mine. I'm curious how you're getting the different leadoff numbers for different lineups.
jinaz, Sunday, April 13, 2008
@Anthony, that's a modification that I had to do with Beamer's help. Here's how I'm doing it:

First, in row 31, I created some lineup lookup numbers. So, C31 is 1, D31 is 2, E31 is 3, etc.

Then, in row 32, columns C-K, I pull the actual innings led off from the model. The code in C32 is "='Line-up start'!L61" (without double quotes). D32 is L62, E32 is L63, and so on.

Finally, I then use an Hlookup function to pull the correct innings led off from row 32. The code for C30 is: "=HLOOKUP(C29,$C$31:$K$32,2,FALSE)". D30 is the same, except that the lookup cell is D29 instead of C29.

I uploaded a screenshot of how it looks here. Let me know if this is unclear. -j
jinaz, Sunday, April 13, 2008
@MGL,

John Beamer's the better person to ask about the inner workings of the model. I'm just plugging in numbers. But here's what I know:

I assume that the Markov model does not include baserunning?

The model does include situation-based baserunning. But aside from stolen bases I don't think it varies the probabilities of taking extra bases, pickoffs, etc, based on the given batter's skills. I *think* the best we can do in this model is make adjustments across all batters and see how that changes the results. I haven't messed with this at all.

Also, does the model include GDP's and the differences, for example, between a strike out and batted ball out (moving runners over, etc.)?

The model does request inputs of GDP's and strikeouts, so I'm pretty sure that it includes those in its calculations. And that it handles advancement on strikeouts differently than regular outs. Not positive about that, however.

Does it move runners over differently on a batted ball out, depending on the handedness of the batter?

There is a matrix that allows one to manipulate the frequency with which one takes a base on an out, and it does so for all base/out situations. However, the model knows absolutely nothing about the handedness of the batter.

Finally, if it uses GDP, does it use a GDP projection for each player or does it assume the same GIDP rate for all players? If the latter, is it based on BIP for that player, and does it distinguish between RH and LH batters (the former hit into DP's more than twice the rate as the latter)?

It does ask for input on GDP's, and the baserunning matrices have a specific category for what happens to baserunners during a double play.

However, because I'm using PECOTA projections, which do not include GDP projections, I estimated GDP's for all players based on 2007 averages. I did it based on BIP rates (i.e. GDP / [AB-K-HR]), not GDP/PA. I imagine that including information about speed would also be helpful, but I didn't do this.

I have a sim that incorporates everything. I can run it on several lineups if you want and report back.

If you'd like to run your sim on the Baker opening day lineup and the top 3 (or whatever) and bottom 3 opening day lineups according to this model, I think that'd be pretty interesting to compare to the results!

I also agree about the left vs. right lineups, and breaking up lefties to prevent late-inning left-handed relievers from neutralizing your lineup. However, this model doesn't have the capability of dealing with that yet (right John?). Still, the philosophy behind left & right lineups should be similar, no? It's just that you use different input data, i.e. data that recognizes left/right splits of players.

In other words, while this model might not allow us to identify The Best Real Life Lineup, it should help us understand what approach(es) to lineup construction are most successful given a set of input data.

Thanks for dropping by!
-j
Anonymous, Sunday, April 13, 2008
Brilliant! Thanks, Justin.
Anonymous, Sunday, April 13, 2008
Using my sim, I ran each lineup 100,000 times at home (neutral stats adjusted for home field advantage) in a neutral park against a neutral, league-average (neither RH nor LH) pitcher, using my projections for each player. My sim includes baserunning, GIDP, etc., so it is pretty much all encompassing. The standard deviation of runs per game for one team in 100,000 games is .009. So these numbers are plus or minus .018 runs at 2 sigma (with a 95% confidence interval).

Baker: 4.560
Baker with Votto rather than Hatteberg: 4.646 rpg +13.9
Your #1: 4.604 +7.1
Your #2: 4.618 +9.4
Your #3: 4.615 +8.9
Your worst: 4.626 +10.7
Your 3rd worst: 4.620 +9.7
Your 4th worst: 4.549 +1.8
Jinaz-OD: 4.588 +4.5
Jinaz-OD-exploit: 4.579 +3.1

I re-ran each lineup at home at GABP, rather than a neutral park:

Baker: 4.640
Baker with Votto rather than Hatteberg: 4.763 +19.9
Your #1: 4.706 +10.7
Your #2: 4.736 +15.6
Your #3: 4.758 +19.1
Your worst: 4.729 +14.4
Your 3rd worst: 4.731 +14.7
Your 4th worst: 4.672 +5.18
Jinaz-OD: 4.679 +6.3
Jinaz-OD-exploit: 4.690 +8.1

Let me say a couple of things: One, Dusty’s lineup is one of the worst you can put out there, as you can see from the above, based on my projections and my sim. You really have to make an effort to do as badly as Dusty.

I have much more confidence in a comprehensive sim than a “dry Markov chain.” In fact, I think that using a Markov chain that does not include handedness, baserunning, etc., is a waste of time for evaluating lineups.

Two, the Reds have a roughly average first-string lineup, despite what you often hear about them having a very good one or even a great one. And of course, the defense is awful, as long as Griff, Dunn, and Phillips are out there.

Three, can we stop saying that Griffey is a “great hitter”? He is not anymore. Not even close. He is a below-average hitting corner outfielder. With his terrible defense and baserunning, he is near replacement level. One of the worst overall players in baseball. Possibly the worst full-time player. Has been for a few years.

Four, the biggest mistake by Baker (or whoever makes those decisions) is playing Hatteberg over Votto. I don’t know about their defense, but Votto is almost a win and a half better with the bat than Hatty. If Hatty is a better defender, it probably is not by more than a win, unless Votto is a DH-like entity, awful with the glove. And of course Hatty cannot run the bases a lick. I don’t know about Votto.

Five, the Reds lineup is quite balanced compared to many or even most, so it does not make that much difference who you put where, as you can see from the above. As long as Keppy, Griffey, Dunn and Encarnacion are near the top or middle of the lineup, you are fine. And no one is so bad that they can’t pretty much bat anywhere, although Valentin, being the worst and the slowest, should probably bat last in any lineup.

Six, just eyeballing the above Zips projections, my projections are quite a bit different. In a neutral setting I have something like, in wOBA: Dunn .386, Encarnacion .368, Keppinger .353, Griffey .348, Hatty .338, Phillips .338, Patterson .338, Valentin .323.

Anonymous, Sunday, April 13, 2008
Try from opening day players:
1)Dunn
2)Keppinger
3)Griffey
4)Encarnacion
5)Hatteberg
6)Phillips
7)Patterson
8)Valentin
9)Pitcher

jinaz, Sunday, April 13, 2008
Hi MGL,

Thanks for the quick work!

It is a bit unnerving to see how different your results are. You absolutely could be right about the importance of the handedness and baserunning details that your simulator takes into account. I do wonder how much your final point about the differences between your projections and the PECOTAs I used might also be coming into play though. Looks like the big differences were on Keppinger, Hatteberg, and especially Patterson. As you say, the Reds' lineup is fairly balanced, so differences like that could result in big differences in the rank order of lineups. Of course, the Patterson difference should just help Dusty's case, and he clearly got creamed in your sim.

As for your other points, I generally agree (though I still think that UZR must be missing low with Phillips given how he does with the Fans, PMR, and RZR...but we've had that conversation already!). A point I've made a few times is that if the Reds are going to contend, they're going to need surprises from both their offense and defense. And the only way they'll get surprises from their offense is if they play high-upside players like Jay Bruce and Joey Votto over known quantities like Patterson and Hatteberg.

FWIW, Votto does have pretty good speed for a first baseman (perhaps average overall?), though reviews of his glove have been a bit mixed. I'm just assuming that he's an average defender for now. Hatteberg's been all over the place from year to year defensively, but I think he's at least not terrible.
-j

jinaz, Monday, April 14, 2008
@Anonymous,

I get 4.87 R/g, 788.5 runs/season, -0.6 runs above Baker per season.

-j

Unknown, Monday, April 14, 2008
MGL

The model includes baserunning and GDPs. It also includes the difference between a strikeout and a batted-ball out.

The GDP variable is calculated for each player individually based (principally) on the number of singles they have.

There is an implicit, debatable assumption here: that the ratio of GB/1B is roughly constant (no idea if this is true).

One thing I don't adjust for is left-handed vs right-handed batters. As you point out, I probably should.

I tested the lineup part of the Markov extensively against real-world data (especially the PA per batting position), so I think it works fairly well (within the constraints of the Markov approach). Of course a fully fledged sim allows you to take into account more variables but doesn't model the essence of the game, which is what the Markov does (not that that matters).

Justin -- if there is additional work you want me to do, drop me a line. Also, I am more than happy to make the Markov "source code" available to anyone who has bought the THT annual -- the only reason I didn't is because it is pretty complicated.

-Beamer

ps to do the lineup analysis correctly you have to iterate the PA per position. For similar lineups this shouldn't make a difference, but it can be a source of instability ... you do this by copying and pasting the computed PA per position into the relevant part of the lineup spreadsheet.

jinaz, Monday, April 14, 2008
"ps to do the lineup analysis correctly you have to iterate the PA per position. For similar line-ups this shouldn't make a difference but can be a source of instability ... you do this by copying and pasting the computed PA per position into the relevant part of the line-up spreadsheet."

This sounds like it could potentially make a difference, so I'd like to try it.

I'm having a hard time figuring out how to do this, though. I snooped around the common calculations page, but I'm not seeing it. Do I need the source code version to find the calculated PA? I'm also unclear about where I'd paste them.

Thanks,
Justin

Unknown, Monday, April 14, 2008
Justin -- no, you do it manually. Let me have a look when I have 5 mins today and drop you a line with detailed instructions.

By the way, to compare this with MGL's sim we should absolutely use identical forecasts ...

Anonymous, Monday, April 14, 2008
I don’t know how much difference it makes, but the sim uses all kinds of other variables, such as each player’s projected sb/cs rate, foul out rate, infield singles rate, roe rate, bunt rate (attempts and results), sacrifice rate (attempts and results), IBB rates, etc. Shouldn’t really make much difference in terms of batting order, I wouldn’t think.

Here are the projections it is using for each player, more or less. Singles are regular, infield, and bunt (sac and regular bunts) attempts combined. All are per 510 PA, where a PA is not including an IBB. These are park neutral and scaled to average NL offensive rates for 05-07, where the average wOBA for a non-pitcher is .340.

Dunn: 1B 51, 2B 22, 3B 0.9, HR 29, BB+HBP 83, SO 127, SB 4, CS 1, wOBA .382
Encarnacion: 1B 79, 2B 29, 3B 1.7, HR 17, BB+HBP 50, SO 75, SB 7, CS 2, wOBA .360
Keppinger: 1B 107, 2B 24, 3B 3.0, HR 6, BB+HBP 38, SO 33, SB 4, CS 2, wOBA .342
Griffey: 1B 68, 2B 22, 3B 0.7, HR 20, BB+HBP 52, SO 87, SB 4, CS 1, wOBA .335
Hatteberg: 1B 81, 2B 24, 3B 0.9, HR 8, BB+HBP 61, SO 46, SB 1, CS 1, wOBA .329
Phillips: 1B 82, 2B 22, 3B 3.2, HR 15, BB+HBP 35, SO 76, SB 14, CS 5, wOBA .324
Valentin: 1B 79, 2B 26, 3B 0.8, HR 12, BB+HBP 37, SO 68, SB 1, CS 1, wOBA .311
Patterson: 1B 84, 2B 25, 3B 4.1, HR 15, BB+HBP 26, SO 98, SB 25, CS 8, wOBA .311

As you can see, my projections are not very flattering - only 3 players are above average hitters and the entire lineup averages .337 in wOBA, 3 points below the NL average, which amounts to 15 runs worse than average per year. And that is with their best lineups (not including Votto and Bruce, I guess).
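
The arithmetic in that paragraph can be roughly reproduced as follows. This is only a sketch: the wOBA-to-runs divisor of about 1.15 and the 5,600 non-pitcher PA per season are assumptions chosen for illustration, not figures taken from the sim, and the average is unweighted.

```python
WOBA_SCALE = 1.15    # assumed wOBA-to-runs divisor (commonly used with wOBA)
LEAGUE_WOBA = 0.340  # NL non-pitcher average quoted above
TEAM_PA = 5600       # assumed non-pitcher PA per season (illustrative only)

# Projected wOBA per player, as listed above
woba = {
    "Dunn": 0.382, "Encarnacion": 0.360, "Keppinger": 0.342, "Griffey": 0.335,
    "Hatteberg": 0.329, "Phillips": 0.324, "Valentin": 0.311, "Patterson": 0.311,
}

lineup_avg = sum(woba.values()) / len(woba)  # simple unweighted mean
runs_vs_avg = (lineup_avg - LEAGUE_WOBA) / WOBA_SCALE * TEAM_PA

print(f"lineup wOBA: {lineup_avg:.3f}")
print(f"runs vs. league average: {runs_vs_avg:+.0f}")
```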

I don't have much optimism for the Reds, BTW. Pre-season I had them at 77 wins. As of yesterday, I had them at 78 wins. After today, it is 77.5, with a 1% chance of winning the pennant and 0% (rounded off to the nearest percent) chance of winning the WS. I do have them with a 7% chance of making the post-season. I have them as the 6th worst team in the NL (could be worse I guess) in front of only PIT, WAS, HOU, FLO, and SF. My season projections include some significant playing time for good players like Bailey, Votto, and Bruce. I actually like their pitching (3 wins over average). I have their staff as the 5th best in the NL, behind only ARI, LA, Mets, and SD, with MIL right behind them.

MGL

Anonymous, Monday, April 14, 2008
Oh, and I was mixing up Phillips with someone else in my mind. He was +5 in 07 UZR and I have him with an average defensive projection overall.

The Reds still have one of the worst projected team defenses (-33 per 150), though, owing mostly to Griffey and Dunn, who are a combined 35 runs or so below average per 150 in projected UZR (-15 for Dunn and -20 for Griff).

I have Hatty with an average defensive (UZR) projection and below average (-2) in "scooping bad throws."

MGL

jinaz, Monday, April 14, 2008
MGL, thanks for those projections. I'll try to give them a run tonight in the Markov and see what happens. I may try Beamer's additional step of iterating the PA based on lineup spot as well if I can figure that out.

I really would think that these two approaches would result in fairly similar results, at least in a gross sense. If nothing else, they both should be vastly superior to the regression-based analyses we've seen. -j

Xeifrank, Monday, April 14, 2008
Justin, this is a nice exercise. Great job. I have a question, and I must admit that I haven't yet read the whole study (I will, I will). With the Pinto tool for projecting runs per season, did you make any adjustments for "splits", i.e., LHB vs LHP, LHB vs RHP? Does the Pinto model take into account speed/baserunning? If not, do you think it's important that you take into consideration more factors than OBP and SLG? Perhaps you'd see at the very minimum some adjustments to the run totals. Thanks!
vr, Xeifrank

Xeifrank, Monday, April 14, 2008
Ok, my questions were answered in reading the comments. I was thinking along the same lines as MGL, that a simulator would do a better job than a Markov Chain. I have a simulator too, but he beat me to it. :) Good work everyone.
vr, Xeifrank

jinaz, Monday, April 14, 2008
Xeifrank, the above results are not based on David Pinto's lineup tool. They use Markov chains, which are definitely a better tool for this purpose. The sim might be better, but I honestly wouldn't expect the two to be *that* different given how much info the Markov uses. -j

Xeifrank, Monday, April 14, 2008
Justin, that's great that John put this tool together. Like he said, you and MGL need to use the same input data. Keep up the great blog, your site is an everyday must-read. vr, Xei

Unknown, Monday, April 14, 2008
Justin

The iteration step is straightforward. The PA per lineup spot should "converge": all you need to do is copy the computed PA into the PA per position 2-3 times, update the SBA (push the button) each time, and then the rpg shouldn't change.

If you do get different results, perhaps you can send me the exact input data you use and I will look into the Markov logic and try to piece together whether anything is falling down. I'm pretty sure everything is working as it should, but as with all these things you never know. Then I can also adjust for other stuff like handedness etc. All those variables you can adjust for manually in the code, but it is harder to automate in a mathematical model as you have to make assumptions that don't hold up to reality.
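
The copy-and-recompute loop described here is a fixed-point iteration. The sketch below shows the idea on a deliberately simple stand-in model; it is not the spreadsheet's logic. The toy assumptions are that a game has 27 outs, that every out comes from a plate appearance, and that each lineup spot gets 1/9 of a PA fewer per game than the spot before it. The OBP values are invented for illustration.

```python
def team_obp(pa, obp):
    """PA-weighted on-base percentage of the lineup."""
    return sum(p * o for p, o in zip(pa, obp)) / sum(pa)

def computed_pa(pa, obp):
    """Toy model: recompute PA per lineup spot from the current PA guess."""
    total = 27.0 / (1.0 - team_obp(pa, obp))  # PA needed to make 27 outs
    # Spread the total so each spot gets 1/9 PA less than the one before it
    return [(total + 4 - i) / 9.0 for i in range(9)]

def iterate_pa(obp, tol=1e-9, max_iter=50):
    """Feed computed PA back in as input until the values stop changing."""
    pa = [4.2] * 9  # arbitrary starting guess
    for n in range(1, max_iter + 1):
        new_pa = computed_pa(pa, obp)
        if max(abs(a - b) for a, b in zip(pa, new_pa)) < tol:
            return new_pa, n
        pa = new_pa
    raise RuntimeError("PA per position did not converge")

# Hypothetical OBPs for lineup spots 1-9 (illustration only)
obp = [0.36, 0.34, 0.35, 0.38, 0.33, 0.32, 0.31, 0.30, 0.216]
pa, n_iter = iterate_pa(obp)
print(f"converged after {n_iter} passes")
print([round(x, 3) for x in pa])
```

In this toy setup the values settle within a handful of passes, which is in line with the "2-3 times" suggested above; a real model may behave differently.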

-beamer

jinaz, Tuesday, April 15, 2008
Important Note!!

John Beamer discovered an error in my use of his model tonight. I'll re-do all of the lineups using this modification, but I ran out of time to do it tonight. Initially, it looks like the major findings remain the same, and that Baker continues to do well...better, in fact.

MGL, I haven't given your projections a try yet, but will tomorrow night (hopefully).

@Anthony, my instructions to you weren't quite right. Forget the Hlookup. What you should do is just copy the entire row in your new row 32 up to row 30. Row 30 does not know anything about the lineup order that you manipulate in row 29. In other words, whatever is in C30 is always associated with the leadoff hitter.

John recommends doing this a few times, pushing the SBA button in between, until the values converge. I'm finding that just creating a direct link to row 32 works fine--no apparent problem with infinite loops, etc.
-j

jinaz, Tuesday, April 15, 2008
Ok, I've run MGL's splits through the model. And I'm sure I'm now using the model correctly.

Those projections are much worse than the PECOTAs! But the rank order from this model has Baker actually coming out on top. Here are the results, listed in the same order you listed them:

Baker OD: 4.41 r/g, 715 r/sea

Old Top 3:
Bluzer OD: 4.32 r/g, 700 r/sea, -16 above Baker
Chris-OD2: 4.34 r/g, 703 r/sea, -12 above Baker
Pickoff-OD: 4.36 r/g, 706 r/sea, -9 above Baker

Old Bottom 3 (bottom first):
redmanrick-OD: 4.26 r/g, 691 r/sea, -24 above Baker
Brad-OD2: 4.30 r/g, 697 r/sea, -18 above Baker
fareast-OD: 4.33 r/g, 701 r/sea, -14 above Baker

Mine:
jinaz-OD: 4.39 r/g, 711 r/sea, -4 above Baker
jinaz-OD-exploit: 4.33 r/g, 701 r/sea, -14 above Baker

I'm using 2007 #9-slot hitting totals now, which come to a 0.170/0.216/0.250 hitting line. Before I was using pitchers only, but this helps account for the late-inning pinch hitters.

So...Still disagreements between the two systems. These come out much lower than yours (using the same projections, mostly), and with a different rank order. And the range is a bit higher as well, with ~24 runs between the worst and best in the Markov and ~11 runs between the best and worst in your sim.

The rank order differences are just bizarre though. Baker's lineup comes out on top in this Markov, but is doing terribly in your sim. In fact, the worst lineup according to this Markov is the best in your sim...it's almost like they're inverted!
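
One way to put a number on "almost inverted" is a Spearman rank correlation between the two sets of results. The sketch below pairs the Markov runs per season listed above with the neutral-park sim figures from MGL's earlier comment, assuming the two lists really are in the same lineup order (as stated above); the helper functions are written out by hand for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Order: Baker, old top 3, old bottom 3, jinaz-OD, jinaz-OD-exploit
markov_runs = [715, 700, 703, 706, 691, 697, 701, 711, 701]
sim_rpg = [4.560, 4.604, 4.618, 4.615, 4.626, 4.620, 4.549, 4.588, 4.579]

rho = spearman(markov_runs, sim_rpg)
print(f"Spearman rank correlation: {rho:.2f}")
```

A value near -1 would mean a full inversion and a value near 0 would mean no relationship; with only nine lineups, the result should be read loosely.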

????
-Justin

Chris at Redleg Nation, Tuesday, April 15, 2008
Fun stuff, Justin. I thought you were accounting for platoon splits, for some reason - hence my two lineups for everything.

The "bottleneck" theory is a very interesting one. Baker is sailing into the wind here, but some of the typical nonsense moves we might expect (Juan Castro batting second; Phillips' crappy OBP in the cleanup spot) may actually be beneficial.

Anonymous, Tuesday, April 15, 2008
Justin,
That's amazing that the two systems are inverted on the top and bottom lineups. Perhaps I should try running the two on my simulator and see which one my simulator trends towards. If you'd like me to try that, I could, but I format my hitters' input data a little differently. I would need to know OBP, AB, 1B, 2B, 3B, HR, BB, K for the hitters.
vr, Xeifrank

jinaz, Tuesday, April 15, 2008
Xeifrank, the more the merrier! Tango is also going to run MGL's projections on these lineups and see what his simple Markov comes up with.

As for the data, you should be able to calculate all of that from MGL's posted projections. But I can also look them up for you when I get a sec. -j
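
A rough sketch of that calculation, using Dunn's per-510-PA line from MGL's comment. The assumptions are loud ones: AB is approximated as PA minus BB+HBP (sacrifices are ignored), OBP uses PA as its denominator, and the function name is made up for illustration.

```python
def slash_line(singles, doubles, triples, homers, bb_hbp, pa=510):
    """Approximate AB/AVG/OBP/SLG from per-PA projection counts."""
    hits = singles + doubles + triples + homers
    ab = pa - bb_hbp  # assumption: no sacrifice flies or bunts
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return {
        "AB": ab,
        "AVG": hits / ab,
        "OBP": (hits + bb_hbp) / pa,  # assumption: PA as the denominator
        "SLG": total_bases / ab,
    }

# Dunn: 1B 51, 2B 22, 3B 0.9, HR 29, BB+HBP 83 per 510 PA
dunn = slash_line(51, 22, 0.9, 29, 83)
print({k: round(v, 3) for k, v in dunn.items()})
```

The same call works for the other seven hitters by swapping in their counts; strikeouts can be passed through unchanged.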

Tangotiger, Wednesday, April 16, 2008
Hmmm... looks like the post was lost.

Using the PECOTA forecasts noted in the main blog entry, and ignoring batting handedness, speed, and GIDP (all things I would not normally ignore), with wOBA in parens:

1. Hatty (.358)
2. Dunn (.401)
3. Junior (.360)
4. Encarnacion (.367)
5. Keppinger (.348)
6. Phillips (.334)
7. Valentin (.333)
8. Pitcher (.160)
9. Patterson (.311)

I don’t see how it’s possible to have such a big disagreement here. Other than Dunn and Patterson, the PECOTA forecasts are very tight for the rest of the players.

Here’s a reasonable lineup (5 best players remain in the top 5, don’t touch the pitcher), but in the worst possible combination:
1. Encarnacion (.367)
2. Junior (.360)
3. Dunn (.401)
4. Keppinger (.348)
5. Hatty (.358)
6. Patterson (.311)
7. Phillips (.334)
8. Pitcher (.160)
9. Valentin (.333)

And that’s just 3 runs worse than the optimal one that my Linear Weights model would suggest.

Basically, it’s pretty difficult to create a bad lineup (notwithstanding the exceptions I noted at the start of this post).