Sunday, December 16, 2007

More Burton vs. Rivera Comparisons

Quantifying player similarities--and thus, perhaps, gaining insight into why players vary in performance, and how they might perform in the future--is an approach that dates back (at least) to Bill James. His formulas for finding similar players are actively used at Baseball-Reference. The same comparable-player idea is the basis for PECOTA, which is probably the best publicly-available projection system.

Joe P. Sheehan published a neat article yesterday that takes the next step in comparing players. He calculates similarity scores on individual pitches using pitchf/x, allowing him to find pitchers who are similar in the pitches they actually throw at batters. The similarity scores know nothing of effectiveness--they just use velocity and break (horizontal and vertical, pfx_x and pfx_z) from the pitchf/x data.

One of the pitchers that Sheehan investigated was Mariano Rivera and his cutter. Here is the list of pitchers who had a comparable cutter in MLB last season, according to Sheehan's system:

| Name | Pitch | Throws | MPH | pfx_x (in) | pfx_z (in) | Score |
|---|---|---|---|---|---|---|
| Mariano Rivera | FB | R | 93.4 | 2.72 | 7.72 | 100* |
| Jared Burton | FB | R | 93.4 | 1.57 | 7.58 | 98* |
| Brandon Medders | SL | R | 91.2 | 2.27 | 9.40 | 95 |
| Juan Salas | FB | R | 90.9 | 1.02 | 8.05 | 95* |
| Jon Lester | FB | L | 92.1 | 4.50 | 9.56 | 95 |
| Jason Isringhausen | CT | R | 90.3 | 1.69 | 7.92 | 95 |
| Randy Flores | FB | L | 90.0 | 1.79 | 7.41 | 95 |
| Jonathan Broxton | CT | R | 96.3 | 1.03 | 8.40 | 94 |
| Brian Wolfe | CT | R | 92.6 | -0.39 | 6.97 | 94 |
| Kevin Cameron | FB | R | 91.9 | -0.11 | 6.64 | 94 |

Yep, that's Jared Burton, coming out well above everyone else in the similarity scores. This is consistent with some previous work that John Walsh did for us in September, which also showed qualitative similarities between Burton and Rivera. It's pretty neat to see just how similar they are, relative to other pitchers in baseball.

Now, as Sheehan makes clear, Rivera's pitch moves a good inch more horizontally than Burton's (2.72" vs. 1.57"). And small sample sizes could be a factor here as well. But I think it's darn interesting that we keep seeing Rivera pop up as a comparable pitcher to Burton. As I've said before, it's very unlikely that Burton will come anywhere close to having a career like Rivera's. But having a pitch that is similar, at least in some ways, to Rivera's can't hurt...
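
For the curious, here is a rough sketch of how a comparison like this could be run on the numbers in the table above. To be clear, this is not Sheehan's formula (his article has the details). It's just a plain Euclidean distance over velocity and the two break components. The function name `rank_by_distance` and the `mph_scale` and `break_scale` knobs, along with their defaults of 1.0, are my own placeholders. I left out the two lefties because I don't know how the sign of pfx_x should be handled for them.

```python
import math

# (name, mph, pfx_x in inches, pfx_z in inches) for the right-handers
# in the table above. Values are copied straight from the table.
PITCHES = [
    ("Mariano Rivera", 93.4, 2.72, 7.72),
    ("Jared Burton", 93.4, 1.57, 7.58),
    ("Brandon Medders", 91.2, 2.27, 9.40),
    ("Juan Salas", 90.9, 1.02, 8.05),
    ("Jason Isringhausen", 90.3, 1.69, 7.92),
    ("Jonathan Broxton", 96.3, 1.03, 8.40),
    ("Brian Wolfe", 92.6, -0.39, 6.97),
    ("Kevin Cameron", 91.9, -0.11, 6.64),
]


def rank_by_distance(target_name, pitches, mph_scale=1.0, break_scale=1.0):
    """Rank pitches by scaled Euclidean distance from the target pitch.

    mph_scale and break_scale are hypothetical weights (assumptions, not
    taken from Sheehan's article): a gap of mph_scale mph counts the same
    as a gap of break_scale inches of movement.
    """
    by_name = {name: (mph, x, z) for name, mph, x, z in pitches}
    t_mph, t_x, t_z = by_name[target_name]

    ranked = []
    for name, (mph, x, z) in by_name.items():
        if name == target_name:
            continue  # don't compare the target pitch to itself
        distance = math.sqrt(
            ((mph - t_mph) / mph_scale) ** 2
            + ((x - t_x) / break_scale) ** 2
            + ((z - t_z) / break_scale) ** 2
        )
        ranked.append((name, distance))

    # Smallest distance first, i.e. most similar pitch first.
    ranked.sort(key=lambda pair: pair[1])
    return ranked


if __name__ == "__main__":
    for name, distance in rank_by_distance("Mariano Rivera", PITCHES):
        print(f"{name:20s} {distance:5.2f}")
```

With the placeholder scales, Burton's fastball comes out closest to Rivera's, largely because the two velocities are identical and the vertical break is nearly so. The rest of the ordering shouldn't be expected to match Sheehan's scores, and it will shift if you change the scales.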

On a broader note, this line of work has tremendous potential, perhaps most significantly for our ability to identify player similarities. As PECOTA has shown, comparing a player to similar players is an extremely effective way to predict future performance. Using quantitative "scouting" data like pitchf/x should eventually allow us to greatly improve our ability to identify similar pitchers, and thus predict their performance.


  1. I'm very excited about this kind of analysis and have been waiting for it for some time. I've often wondered why we didn't run analysis on players with similar scouting profiles on the 20-80 scale. There's really no reason, technically, why we can't bridge the scout/stat gap and this seems a great place to start.

  2. Tom Tango has often said that the pitchf/x stuff really is where scouting and stats meet. It's going to be extremely exciting to see what happens with this kind of thing over the coming year or two. Projections seem like one of the big places where these kinds of data can make an enormous positive difference. -j