So, the Cubs won the World Series in 2016. The last time they won was in 1908, when the American League and the National League had eight teams each. A waiting time as long as the one the Cubs have had raises a lot of good math questions. So, here are a few math prompts, with the following assumptions.
Pretend that baseball begins in 1909. Thus, nobody has yet won a World Series. Assume that each team has an equal likelihood of winning the World Series each year [not true, of course, but this is a ‘necessary’ simplification]. Also, make sure to follow a team through its franchise changes. For example, in 1909, there was a team called the New York Highlanders; they became the New York Yankees in 1913. The Brooklyn Superbas became the Brooklyn Dodgers and then eventually today’s Los Angeles Dodgers.
- Pick a team that existed in 1909 and follow them through to 2016 accounting for changes in franchise. What’s the probability that the chosen team will win their first World Series in 2016? Remember to account for the fact that as the years go by, the total number of teams playing between the American League and the National League changes.
- Given all the teams that existed in 1909, what is the probability that all of those teams would have won the World Series at least once by 2016?
- If baseball started in 2017 with the current roster of teams, how many years would it take, on average, for all teams to win the World Series at least once, assuming equal likelihood each year?
- If baseball started in 2017 with the current roster of teams, how many years would it take, on average, for the Cubs to win again?
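Under the equal-likelihood assumption, some of these prompts also have exact answers, which can be handy for checking student work. What follows is a minimal sketch, not a definitive solution. The expansion schedule in `EXPANSIONS` is my own assumption about when the combined AL and NL team count changed, so verify it against the sources you use. The function names are made up for illustration. The sketch treats every season as producing a winner, so the cancelled 1994 Series is not special-cased, and it reads "how long" as an expected number of years.

```python
from fractions import Fraction

# ASSUMPTION (not from the post): first season -> total AL + NL teams from then on.
EXPANSIONS = {1909: 16, 1961: 18, 1962: 20, 1969: 24, 1977: 26, 1993: 28, 1998: 30}


def teams_in(year):
    """Number of teams in a given season under the assumed schedule."""
    start = max(y for y in EXPANSIONS if y <= year)
    return EXPANSIONS[start]


def first_win_probability(first_year=1909, target_year=2016):
    """P(a fixed team loses every season before target_year, then wins it)."""
    p = Fraction(1)
    for year in range(first_year, target_year):
        p *= 1 - Fraction(1, teams_in(year))  # lose this season
    return p * Fraction(1, teams_in(target_year))  # win the target season


def expected_years_until_all_win(n_teams):
    """Coupon-collector mean: n * (1 + 1/2 + ... + 1/n)."""
    return n_teams * sum(Fraction(1, k) for k in range(1, n_teams + 1))


def expected_years_until_team_wins(n_teams):
    """Mean of a geometric waiting time with success probability 1/n."""
    return Fraction(n_teams)


if __name__ == "__main__":
    print("First prompt:", float(first_win_probability()))
    print("Third prompt:", float(expected_years_until_all_win(30)))
    print("Fourth prompt:", float(expected_years_until_team_wins(30)))
```

Using `Fraction` keeps the arithmetic exact until the final conversion to a float, which avoids rounding drift over the 100-plus factors in the product.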
There are a few routes that you (the teacher) can take with these exercises.
- If you are teaching a Statistics course and your students have a little programming background, then this is a great way to introduce Monte Carlo methods and simulation. Each of the above exercises is geared for simulation and requires the student to build an empirical probability distribution. From there we can answer many other questions. For example, with an empirical probability distribution, we can answer “What is the probability that the last of the teams that existed in 1909 wins its first World Series at some point between 1945 and 1955 [or any other year interval]?” And since they have a bit of a programming background, for a real challenge they can try to account for each team’s likelihood of winning the World Series, using the team’s previous year’s win/loss record as a proxy. Be warned: that’s a lot of work, and it’s not obvious how the odds would be calculated [it can be done, but it is really tedious and we’d have to develop a pari-mutuel betting system to get risk-neutral odds; this is more suitable for Master’s degree seeking students].
- If your students don’t have a programming background, but they are taking a Statistics course, then pick one of the prompts above and provide simulated results. Shoot me an email or a tweet and I can provide simulated results. With the simulated results, have students build an empirical probability distribution and answer the prompted questions. For deeper inquiry, students can pose and answer their own questions about the likelihood of certain events.
- If your students are neither taking a Statistics course nor have a programming background, there’s still stuff to do! Again, shoot me a message for simulated data, and with it, you [the teacher] can construct the probability distributions and their associated graphs. This is a great way to get students to read graphs and interpret the results.
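For the simulation route, here is one way the "all teams win at least once" prompt could be set up. This is a rough sketch under the post's equal-likelihood assumption, with a fixed roster of 30 teams. The function names, the trial count, and the seed are all hypothetical choices of mine, not part of the original exercises.

```python
import random
from collections import Counter


def years_until_all_win(n_teams, rng):
    """Simulate seasons until every team has won at least once."""
    winners = set()
    years = 0
    while len(winners) < n_teams:
        winners.add(rng.randrange(n_teams))  # each team equally likely to win
        years += 1
    return years


def empirical_distribution(n_teams=30, trials=10_000, seed=0):
    """Map each observed waiting time (in years) to its relative frequency."""
    rng = random.Random(seed)  # seeded so a class can reproduce the same run
    counts = Counter(years_until_all_win(n_teams, rng) for _ in range(trials))
    return {years: count / trials for years, count in sorted(counts.items())}


def prob_between(dist, lo, hi):
    """Empirical probability that the waiting time falls in [lo, hi] years."""
    return sum(p for years, p in dist.items() if lo <= years <= hi)


if __name__ == "__main__":
    dist = empirical_distribution()
    mean = sum(years * p for years, p in dist.items())
    print("Estimated mean waiting time (years):", round(mean, 1))
    print("P(100 to 150 years):", round(prob_between(dist, 100, 150), 3))
```

The dictionary returned by `empirical_distribution` is the kind of table students can graph or query, and `prob_between` answers interval questions of the "between year A and year B" type once waiting times are converted to calendar years.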
Wikipedia’s MLB page gives inception dates for teams, which can help to track franchise history.
The Baseball Almanac gives a yearly breakdown, team by team. This is a rich source of information.