I am going to start this article with data supporting why, under no circumstances, you should ever bet an MLB game to go to extra innings before the game begins. Then, I am going to take you down the path of statistics and machine learning as we become the most knowledgeable extra-inning bettors in the market. By the end of this article, my hope is that you, as analytical gamblers, will be equipped with extra-inning knowledge second to none, and your sons and daughters will live to hear of the day when that knowledge began.

Exploratory Analysis
Let's get the bad out of the way. Since 2017, extra-inning games occur about once every 11.96 games (8.36%). Each year is different, but 2022 is hovering a little above the average at once every 11.73 games (8.52%). With that in mind, without any Vegas juice, an extra-inning probability of 8.36% corresponds to a fair price of +1096. However, as I am writing this article on the eve of August 5th, the average price for the 14 games to go to extra innings tomorrow is a measly +717, which implies a probability of 12.24%. The vig is a part of every bet, as it is what keeps Vegas in business. Nonetheless, with DraftKings pricing in an extra-inning probability 46% higher than what we have seen over the last five years, there should be more analysis before hitting Place Bet.
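For anyone who wants to check those conversions, here is a minimal sketch of the arithmetic in Python. The function names are my own for illustration and do not come from any sportsbook or library.

```python
def american_to_implied_prob(odds: int) -> float:
    """Implied win probability (vig included) for American odds."""
    if odds > 0:
        return 100 / (odds + 100)
    return -odds / (-odds + 100)


def prob_to_american(prob: float) -> int:
    """Fair (no-vig) American odds for a given win probability."""
    if not 0 < prob < 1:
        raise ValueError("prob must be strictly between 0 and 1")
    if prob < 0.5:
        # Underdog side: positive odds, profit on a $100 stake
        return round(100 * (1 - prob) / prob)
    # Favorite side: negative odds, stake needed to win $100
    return -round(100 * prob / (1 - prob))


historical_rate = 0.0836                         # extra-inning rate since 2017
fair_line = prob_to_american(historical_rate)    # price with no vig
book_prob = american_to_implied_prob(717)        # average posted price
markup = book_prob / historical_rate - 1         # relative gap to history

print(f"fair line: +{fair_line}")
print(f"implied probability at +717: {book_prob:.2%}")
print(f"book probability vs. history: {markup:.0%} higher")
```

The gap between the fair line and the posted line is the hurdle any pregame extra-inning bet has to clear.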
All hope should not be lost. Let's use the historical data to find any overarching trends for extra-inning games over the last five years. I removed the first 20 days of each MLB season to allow for regression towards the mean for specific metrics. This was especially important for creating the machine learning model as many of the metrics we use, such as OPS, can be highly inflated or deflated at the beginning of the season.
The first hypothesis concerns divisional games. My belief was that they are more likely to go to extras than non-divisional games. The thought process is that teams who meet often know each other well, which diminishes the hidden advantages a team holds over opponents it rarely plays. When looking into this claim, though, what we see is the opposite. Divisional games go 10+ innings 7.99% of the time while non-divisional games see extra innings 8.62% of the time.
On the same note, I wonder if specific divisions go to extra innings more commonly than other divisions. To do this, I specifically looked at who the home team was, and used them as the divisional marker. Over the last five years, the AL Central home teams have seen the lowest percentage of games go to extras by a large margin. They sit at a 6.48% extra inning rate while the other divisions all sit above 8.34%. The leading division over the last five years is the AL West at an 8.82% clip.
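The splits above are all the same calculation: group the games by some label and take the share that went to extras. A rough Python sketch of that grouping follows. The field names and the four sample rows are made up purely to show the shape of the data and are not real results.

```python
from collections import defaultdict


def extra_inning_rate_by(games, key):
    """Share of games that went to extras, grouped by game[key]."""
    totals = defaultdict(int)
    extras = defaultdict(int)
    for game in games:
        group = game[key]
        totals[group] += 1
        extras[group] += int(game["went_extras"])
    return {group: extras[group] / totals[group] for group in totals}


# Hypothetical rows for illustration only (not actual game data)
games = [
    {"home_division": "AL Central", "divisional": True, "went_extras": False},
    {"home_division": "AL Central", "divisional": False, "went_extras": True},
    {"home_division": "AL West", "divisional": True, "went_extras": True},
    {"home_division": "AL West", "divisional": True, "went_extras": False},
]

by_division = extra_inning_rate_by(games, "home_division")
by_matchup = extra_inning_rate_by(games, "divisional")
print(by_division)
print(by_matchup)
```

Swapping the key for the home team, the away team or the season gives the other breakdowns discussed in this section.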
What about 2022 specifically? The analysis changes slightly as the AL West now holds, by far, the lowest extra inning rate of the bunch at below 6.5%. Surprisingly, the AL East, NL Central and NL East all hold extra inning percentages above 10% this year (after omitting the first 20 days of the season so the early-season numbers can settle) with the AL East at an astounding 10.86% extra inning rate.
Each percentage is highly dependent on the five teams within each division. So which teams are the highs and which teams are the lows when it comes to extra inning rate? The 2022 extra inning top teams are the Chicago Cubs (17.65%), the Tampa Bay Rays (13.79%) and the San Diego Padres (13.48%). On the flip, the bottom teams consist of the Kansas City Royals (4.44%), the Houston Astros (3.37%) and finally the Oakland Athletics at a wimpy 2.27%.
Digging even deeper, some interesting findings include a 20.45% away extra inning rate for the Tampa Bay Rays and a 17.02% extra inning rate at home for the Chicago White Sox, who follow that with only a scanty 4.88% extra inning rate on the road. Quick back of the brain note: games played in Chicago this year have gone to extras 16.85% of the time. Back on track, the last interesting finding concerns the Yankees’ home and away splits. Seven times they have gone to extras at home (15.55%), but only once on the road in 2022 (2.72%).
Help from Machine Learning
Although engaging, as promised in the attention-grabbing title of this article, exploratory analysis alone isn't going to get us to the promised land. Let's move to machine learning to do the heavy lifting.
Before we hop into the model, let's walk through the data. First, I utilized the baseballr package in R created by Bill Petti to accumulate game-by-game standard team metrics such as batting average, OPS, HR percentage, SO percentage and BB percentage.
The next batch of data I acquired by scraping the public site Baseball-Reference. Looping through each game since 2017, we can acquire the time of day, attendance and the overall box score for each game. Joining this data with the data collected from the baseballr package, we have ourselves a solid pool of data. All data before 2022 was the training data and the 2022 season-to-date was used as the testing data.
Now, we have every game since 2017 with the team stats up-to-date for each of the years. For example, if we take the June 11th, 2021 game between the St. Louis Cardinals and Chicago Cubs, we have the year-to-date batting average, OPS, HR percentage, SO percentage and BB percentage for both teams, and we also know that it was a day game, the attendance was 35,112 and the game did not go to extra innings. What we can do now is find the differences between the continuous metrics, for example, home OPS minus away OPS. Those differences will be the predictor variables in the model. Lastly, I hard-coded the division of each team into the dataset, which allowed for the creation of a factor variable indicating whether the game was a divisional game.
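To make the feature construction concrete, here is a small Python sketch of the home-minus-away differences and the divisional flag. The real pipeline was built in R. The class, the field names and the stat values below are placeholders of my own, not the actual numbers for any team.

```python
from dataclasses import dataclass


@dataclass
class TeamStats:
    """Year-to-date team metrics entering a game (hypothetical layout)."""
    division: str
    batting_avg: float
    ops: float
    hr_pct: float
    so_pct: float
    bb_pct: float


def build_features(home: TeamStats, away: TeamStats) -> dict:
    """Home-minus-away differences plus a 0/1 divisional indicator."""
    return {
        "avg_diff": home.batting_avg - away.batting_avg,
        "ops_diff": home.ops - away.ops,
        "hr_pct_diff": home.hr_pct - away.hr_pct,
        "so_pct_diff": home.so_pct - away.so_pct,
        "bb_pct_diff": home.bb_pct - away.bb_pct,
        "divisional": int(home.division == away.division),
    }


# Placeholder values for illustration only (not real team stats)
home = TeamStats("NL Central", 0.240, 0.720, 0.035, 0.240, 0.090)
away = TeamStats("NL Central", 0.235, 0.700, 0.030, 0.225, 0.085)

features = build_features(home, away)
print(features)
```

Working with differences keeps one row per game and lets a single coefficient describe how a gap between the two teams relates to extra innings.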
One of the goals of the model was to obtain not only a yes-or-no prediction of extra innings, but also the probability of each game going to extra innings. We want to limit the number of games that we bet, and having a probability for every game lets us decide where to draw that line ourselves instead of being locked into the model's default "yes" predictions. Therefore, logistic regression was the choice for the model.
Using the bestglm package in R, with a selection criterion of AIC, we get the best logistic regression model that utilizes these four variables:
- X1 = Batting average difference
- X2 = OPS difference
- X3 = SO percentage difference
- X4 = 1 if divisional game, 0 if non-divisional game
Interestingly enough, all four of the predictor variables were statistically significant at the 5% level (95% confidence).
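For reference, a logistic regression on these four predictors has the standard form below. The coefficients are left symbolic because the fitted values are not listed in this article.

```latex
\[
\Pr(\text{extra innings}) =
\frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4)}}
\]
\[
\log\frac{\Pr(\text{extra innings})}{1 - \Pr(\text{extra innings})}
= \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4
\]
```

The second line is the same model written in log-odds, which is the scale on which each coefficient acts additively.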
Predictions
We have our model, now let's use it to predict across the test data. Because the extra-inning games happen at an 8.52% rate, I am going to use the top 8.52% of the predicted probabilities as the "yes" predictions. By doing this, we get an overall error rate of 14.79%. This is pretty good, but what we really care about is the error rate on the "yes" predictions as those will be the only games that we theoretically bet. In this case, the model predicted 111 extra-inning games to occur so far in 2022 and was correct in 18 of those predictions. That's a correct prediction rate of 16.22%, which is higher than the 12.24% break-even rate needed to take money from Vegas at +717. Assuming we had bet $100 on each of the predicted "yes" outcomes and the average line was +717 like we saw before, the 18 winners would have paid $717 apiece ($12,906) and the 93 losers would have cost $100 apiece ($9,300), for a net of $3,606 so far this season. I can certainly live with that.
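The two steps in this section, flagging the top slice of predicted probabilities and then scoring the resulting bets, can be sketched in Python as follows. The helper names and the five sample probabilities are hypothetical. The profit function assumes flat stakes and a single positive American price, where a winning bet returns the profit on top of the stake and a losing bet costs the stake.

```python
def flag_top_fraction(probs, fraction):
    """Mark the highest-probability games as 'yes' picks."""
    n_yes = round(len(probs) * fraction)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    yes = set(ranked[:n_yes])
    return [i in yes for i in range(len(probs))]


def net_profit(wins, losses, american_odds, stake=100.0):
    """Net result of flat bets at one positive American price."""
    if american_odds <= 0:
        raise ValueError("this sketch only handles positive odds")
    profit_per_win = stake * american_odds / 100
    return wins * profit_per_win - losses * stake


# Hypothetical model outputs for five games (illustration only)
probs = [0.05, 0.21, 0.09, 0.17, 0.11]
picks = flag_top_fraction(probs, 0.4)
print(picks)

# Figures from the article: 111 "yes" picks, 18 correct, average line +717
bets, wins = 111, 18
hit_rate = wins / bets
profit = net_profit(wins, bets - wins, 717)
print(f"hit rate: {hit_rate:.2%}")
print(f"net profit on $100 flat bets: ${profit:,.0f}")
```

Using one average price for every bet is a simplification, since the actual line varies from game to game.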
There is only more to learn as more data is collected and more sources are joined in. Knowing everything that we know now, we can easily combine the exploratory analysis that has shown trends over the last five years with the uncovered analysis that machine learning has provided to make the most educated extra inning bets of anyone in the market.
If you enjoyed this article, make sure to come back and check out the future extra inning predictions.