Alright, let's talk about this. Last summer, we ran what we're pretty confident was the first-ever head-to-head AI handicapping contest. Three of the most advanced AI models on the planet, same exact prompt, same daily MLB board, completely transparent tracking. It was supposed to be an experiment. It turned into the most talked-about thing on this site all year. So we're doing it again. And honestly? This year is going to blow last year out of the water.
Here's what you need to know: the 2026 AI Handicapping Contest officially kicks off on Opening Day. We're expanding the field from three to four. We've got new rules coming. And the defending champion has a massive target on its back. Let's get into it.
Last Year's Contest Was Absolutely Wild
If you weren't following along last summer, here's what happened. From July 29 through August 21, 2025, we fed Claude, ChatGPT, and Gemini the exact same information every single day. Same matchups, same pitching probables, same betting lines, same statistical context. Each model analyzed the board and gave us its best picks with full reasoning. We tracked everything, published everything before first pitch, and graded every single pick against the closing line.
What we got back was genuinely surprising. Not just the results themselves, but how differently each model approached the same information. Claude was surgical. ChatGPT was a machine gun. Gemini was the cautious accountant. Three completely different strategies, three completely different outcomes.
The 2025 Final Standings
2025 AI Handicapping Contest: Final Results (July 29 - Aug 21)
| Model | Record | Pushes | Win % | Units | Wagered | ROI |
|---|---|---|---|---|---|---|
| Claude | 79-49 | 3 | 60.7% | +46.53 | 296.30 | +15.70% |
| ChatGPT | 89-62 | 2 | 59.6% | +11.45 | 288.02 | +3.98% |
| Gemini | 70-56 | 2 | 55.9% | -1.81 | 246.51 | -0.73% |
[Figure: 2025 AI Contest Profit Leaderboard]
Look at those numbers for a second. Claude crushed it. A 60.7% win rate and +15.70% ROI across 296 units wagered is legitimately elite. Most professional sharps would kill for a sustained 5% ROI over a full season. Claude more than tripled that in 24 days. That's not a hot streak, that's a model finding edges the market didn't price correctly.
ChatGPT was interesting in its own right. It was the most active model, firing off 153 total decisions compared to Claude's 131 and Gemini's 128. It still turned a profit at +11.45 units, which is respectable, but the volume-over-precision approach cost it. When you swing at more pitches, you're going to make more contact, but you're also going to chase more balls in the dirt.
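The ROI figures and decision counts quoted above can be reproduced straight from the standings. Here's a minimal Python sketch, assuming ROI means net units divided by units wagered and that a "decision" counts wins, losses, and pushes. Both assumptions line up with the published numbers, and the `ModelResult` class is purely illustrative, not part of our tracking tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelResult:
    """One row of the 2025 standings (illustrative container, not contest code)."""
    name: str
    wins: int
    losses: int
    pushes: int
    units: float    # net profit or loss, in units
    wagered: float  # total units risked

    @property
    def decisions(self) -> int:
        # Assumption: every graded pick counts, pushes included.
        return self.wins + self.losses + self.pushes

    @property
    def roi_pct(self) -> float:
        # Assumption: ROI = net units / units wagered, as a percentage.
        return 100.0 * self.units / self.wagered


RESULTS_2025 = [
    ModelResult("Claude", 79, 49, 3, 46.53, 296.30),
    ModelResult("ChatGPT", 89, 62, 2, 11.45, 288.02),
    ModelResult("Gemini", 70, 56, 2, -1.81, 246.51),
]

for r in RESULTS_2025:
    print(f"{r.name:8s} decisions={r.decisions:3d}  ROI={r.roi_pct:+.2f}%")
```

Running it gives 131, 153, and 128 decisions and ROIs of +15.70%, +3.98%, and -0.73%, the same figures shown in the standings.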
And then there's Gemini. Here's the thing about Gemini's 2025 that makes it so fascinating: it won 55.9% of its picks. That's a winning record by any definition. But it still finished in the red at -1.81 units because of the vig. This is the brutal math of sports betting. Winning isn't enough. You have to win at the right price, at the right rate, with the right sizing. Gemini proved it could find winners but couldn't beat the juice. That's going to haunt Google's model heading into 2026.
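To see how a winning record can still lose money, it helps to look at the break-even math. The sketch below assumes flat one-unit risk at a single American-odds price, which is a simplification: Gemini's actual prices and stake sizes aren't broken out here, so the odds used are hypothetical examples, not contest data.

```python
def payout_per_unit(american_odds: int) -> float:
    """Profit per unit risked when a bet at the given American odds wins."""
    if american_odds < 0:
        return 100 / -american_odds
    return american_odds / 100


def break_even_win_rate(american_odds: int) -> float:
    """Win rate needed to break even if every bet is placed at these odds."""
    payout = payout_per_unit(american_odds)
    return 1 / (1 + payout)


def flat_bet_roi(win_rate: float, american_odds: int) -> float:
    """Expected ROI per unit risked at a fixed price, ignoring pushes."""
    return win_rate * payout_per_unit(american_odds) - (1 - win_rate)


# Hypothetical prices, chosen only to show how the bar moves with the juice.
for odds in (-110, -130, -150):
    needed = break_even_win_rate(odds)
    roi = flat_bet_roi(0.559, odds)
    print(f"{odds:+d}: break-even {needed:.1%}, ROI at 55.9% wins {roi:+.2%}")
```

At a standard -110 price the break-even rate is roughly 52.4%, so a 55.9% hit rate would be comfortably profitable. Once the average price gets to around -130, break-even climbs to about 56.5% and the same hit rate slips into the red. That's the sense in which price and sizing matter as much as picking winners.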
The Four Contestants
This year's field has expanded. Three returning competitors with scores to settle, and one brand new challenger that nobody has tested in a handicapping arena. Let's break down the field.
Claude
ChatGPT
Gemini
Grok
How the Contest Works
[Figure: The AI Contest Pipeline]
The entire concept is built on one core principle: fairness. Every morning during the MLB season, each AI model receives the exact same prompt. No special instructions. No tweaked inputs. No thumb on the scale for anyone. Same data, same context, same question. The only thing that differs is how each model's brain processes the information.
What you end up with is four different large language models, built by four different companies with four different philosophies about artificial intelligence, all looking at the same data and reaching their own conclusions. Claude's Constitutional AI training brings a measured, analytical approach. ChatGPT brings raw processing power and creative pattern matching. Gemini's multimodal design brings Google's vast data infrastructure. And Grok's real-time X integration brings something nobody has tested in this space before: live public sentiment data feeding directly into the analysis.
Every pick is published before the games start. Every result is tracked. Every penny of profit or loss is documented and public. There's no place to hide in this contest, and that's by design.
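For a rough picture of what "published before the games start" and "every penny documented" look like in practice, here's a small Python sketch. Everything in it is hypothetical: the `Pick` fields, the `is_published_in_time` and `settle` helpers, the example team, and the timestamps are illustrations under the assumption of moneyline-style American odds and stakes measured in units. It is not our actual tracking code or the official grading rules.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Pick:
    """A single published pick (hypothetical structure)."""
    model: str
    selection: str
    american_odds: int
    stake_units: float
    published_at: datetime
    first_pitch: datetime


def is_published_in_time(pick: Pick) -> bool:
    """A pick only counts if it was published before first pitch."""
    return pick.published_at < pick.first_pitch


def settle(pick: Pick, outcome: str) -> float:
    """Return profit or loss in units for an outcome of 'win', 'loss', or 'push'."""
    if outcome == "push":
        return 0.0
    if outcome == "loss":
        return -pick.stake_units
    if outcome == "win":
        odds = pick.american_odds
        payout = 100 / -odds if odds < 0 else odds / 100
        return pick.stake_units * payout
    raise ValueError(f"unknown outcome: {outcome!r}")


# Placeholder pick with made-up times and price.
example = Pick(
    model="Claude",
    selection="Home team moneyline",
    american_odds=-120,
    stake_units=2.0,
    published_at=datetime(2026, 4, 1, 15, 0, tzinfo=timezone.utc),
    first_pitch=datetime(2026, 4, 1, 17, 5, tzinfo=timezone.utc),
)

if is_published_in_time(example):
    print(f"win: {settle(example, 'win'):+.2f}u, loss: {settle(example, 'loss'):+.2f}u")
```

Summing the settled units per model over a season gives the Units column, and summing the stakes gives Wagered.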
What's Changing for 2026
Last year's 24-day run was a proof of concept. It proved that this works, that people love following it, and that these models can actually produce actionable insights in the sports betting space. So we're going big for Season 2.
We're starting on Opening Day. No more waiting until the middle of the summer. When the first pitch of the 2026 MLB season is thrown, all four models will have their picks locked and loaded. We want a full-season sample size, and that means starting from day one.
The field expands to four. Grok is the biggest addition. Adding a model with real-time social media data integration is going to introduce a completely new dynamic. Will Grok's X pipeline give it breaking news advantages? Will it get burned by chasing viral narratives? We genuinely don't know, and that uncertainty is what makes this compelling.
New rules are on the way. We spent the offseason analyzing every angle of the 2025 data. What worked, what didn't, where the scoring system could be tighter, how to make the daily tracking even more transparent. We're finalizing a refined ruleset for 2026 that addresses everything we learned. The new rules will be published soon, so keep checking back.
New Rules Dropping Soon
We're putting the finishing touches on the official 2026 ruleset. Updated scoring. Tighter transparency standards. New ways to follow along daily. This contest is about to get even more competitive.
Stay tuned for the full rules announcement.
Why You Should Care About This
Let's be real for a second. On the surface, this is four AI models picking baseball games. But underneath, it's one of the most practical tests of artificial intelligence reasoning that exists right now. Sports betting is a domain with clear, binary outcomes: you're right or you're wrong. The scoreboard doesn't grade on a curve, and the closing line doesn't care about your reasoning if you picked the wrong side.
That makes it the perfect real-world benchmark. Can these models genuinely process complex, probabilistic information and produce insights that beat the market? Last year's results suggest at least one of them can. The 2026 contest will tell us if that edge is real, if it's sustainable, and which approach to AI reasoning produces the best outcomes when real money is on the line.
Whether you're a sports bettor looking for an edge, an AI enthusiast who wants to see these models tested outside of chatbot conversations, or someone who just enjoys watching a good competition unfold in real time, this contest has something for you.
The Stage Is Set
Claude enters as the champion everyone is chasing, with +46.53 units of proof that it can find edges in the MLB market. ChatGPT enters hungry, knowing its +11.45 units could have been so much more with better discipline. Gemini enters with something to prove after winning more picks than it lost but somehow finishing in the hole. And Grok? Grok enters as the wildcard that could blow this whole thing wide open or crash and burn spectacularly. Either way, it's going to be entertaining.
Four AI models. One MLB season. Picks published every single day before first pitch. Full transparency, full accountability, and one champion at the end of it all.
The 2026 AI Handicapping Contest starts Opening Day. Check back soon for the official rules, and get ready for what's going to be one hell of a ride.