Isn’t the binomial distribution amazing?

Even if you have only a cursory knowledge of baseball, you know that the very best teams each year win only about 60% of their games. The worst teams still win about 40%. In fact, even incredibly bad, historically bad teams still win 30% of their games.

Just taking the outer boundaries (a 60% team against a 40% team) and treating the games as independent, we can say that in a two-game series the good team will win both 36% of the time (0.6 × 0.6) and the bad team will win both 16% of the time (0.4 × 0.4), so the same team wins both games 52% of the time. And we'd probably expect the real figure to be a bit lower, since most games are between fairly evenly matched teams (where the figure would be 50%).
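For anyone who wants to check that arithmetic, here is a minimal Python sketch. It assumes the two games are independent and that one team's win probability `p` stays fixed across both games; the function name `same_team_sweeps` is made up for illustration.

```python
def same_team_sweeps(p: float) -> float:
    """Chance that the same team wins both games of a two-game series.

    Assumes independent games and a fixed win probability p for one team
    (so the other team wins each game with probability 1 - p).
    """
    return p * p + (1 - p) * (1 - p)


# Evenly matched teams, the 60/40 case from the comment, and a 70/30 mismatch.
for p in (0.5, 0.6, 0.7):
    print(f"p = {p:.1f}: same team wins both {same_team_sweeps(p):.0%} of the time")
```

The value is lowest at p = 0.5 and rises slowly as the matchup gets more lopsided, which is why the range for realistic matchups is so narrow.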

The only way anyone might suggest that the number could be above 60% is if they believe that wins and losses tend to group together in streaks (i.e., momentum), but it's a baseball truism that “momentum is tomorrow's starting pitcher.”

I assume that the question was not posed very accurately to the colleagues. If the question was “when a team wins a game, how often do they win the subsequent identical game?”, then the responses make more sense (common sense dictates that teams still win only 2/3 or so of the rematches). But the analysis itself does not in fact select identical games; it selects consecutive games with different pitchers. Basically, it doesn't isolate a unique scenario at all. The numbers generated are simply the obvious results given the range of winning percentages among major league teams. Without any data analysis, you can predict that the percentage must be between 50 and 52%. And the graph comparing margin of victory seems to be simply identical to the graph of run differential in any sample of games. Past performance is in fact a pretty good indicator of future performance in this instance, but there is a ton of information you could analyze to accurately assess a team's past performance. Whether the team won or not the previous day is a very poor measure. It would be like going to the race track, opening up the Daily Racing Form to look at the past performances, and the form simply telling you what place the horse finished in its last race. Ha!

In the first game, the Phillies started Kyle Kendrick (http://www.baseball-reference.com/players/k/kendrky01.shtml), who is an average pitcher.

In the second game, the Phillies started Roy Halladay (http://www.baseball-reference.com/players/h/hallaro01.shtml), perhaps the best pitcher in baseball.

Perhaps you could analyze how the Phillies do against the Mets with Halladay and with Kendrick separately.

“The large reversal there can probably be attributed to taking excess risk to create offensive opportunities to come back from a deficit at the expense of giving up some odd-man breaks.”

It could be, but we’d need to check whether the rarity of hockey reversals of that magnitude could be due to chance alone.

I would be interested in seeing a similar analysis in other sports, although it may not translate well, since baseball is the only sport I can think of where you have a series of games in the same location. It happens in the playoffs in hockey and basketball, but by that time you are probably talking about fairly evenly matched teams, so you may see similar results. For instance, the Blackhawks lost game one to Vancouver 5-1 and won game two 4-2. The large reversal there can probably be attributed to taking excess risk to create offensive opportunities to come back from a deficit at the expense of giving up some odd-man breaks.

I wonder whether, if you did some sort of ACF (autocorrelation function) plot, you would see a spike every time the ‘good’ pitcher rotated back around.
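One way to get a feel for what such a plot might show is to simulate it. The sketch below is purely hypothetical: it assumes a strict five-man rotation, a 70% win probability when the ace starts and 50% otherwise, and independent games. None of these numbers come from real data, and the helper names are made up for illustration.

```python
import random


def simulate_results(n_games, rotation=5, p_ace=0.70, p_other=0.50, seed=0):
    """Simulate wins (1) and losses (0); the ace starts every `rotation`-th game.

    All parameters are illustrative assumptions, not estimates from real seasons.
    """
    rng = random.Random(seed)
    results = []
    for game in range(n_games):
        p_win = p_ace if game % rotation == 0 else p_other
        results.append(1 if rng.random() < p_win else 0)
    return results


def autocorrelation(xs, lag):
    """Sample autocorrelation of the sequence xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


# A very long simulated record, so that a small effect stands out from the noise.
results = simulate_results(200_000)
for lag in range(1, 11):
    print(lag, round(autocorrelation(results, lag), 3))
```

Under these assumptions the autocorrelation at lags 5 and 10 should come out slightly positive and the other lags slightly negative, but the effect is small. With a single 162-game season it would likely be buried in sampling noise.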
