Can math predict fist fights?

11 May, 2026

Intro

If you ask most MMA fans my age how they got into the sport, they'll respond by recounting the rise of Conor McGregor, or crowding around a dorm room TV watching fights back in 2019 during the UFC's peak in popularity. I, on the other hand, discovered my love for MMA while I was sick with a fever in a Greek cave.

Up until late 2023, my knowledge of MMA was mostly ambient. I knew the UFC existed, I knew who Conor McGregor was, and I knew Joe Rogan would yell into the mic cageside, but I'd never seriously followed the sport. In fact, I had no interest in basically any sport.

But that changed in November 2023 during a family vacation in Santorini.

A few days earlier, while passing through Athens, I somehow managed to get a fever (just my luck to get sick on the vacations I rarely take). So, while my family went out in the pearly white, cat-filled streets of Oia to find some food, I stayed behind in our Airbnb, a traditional Greek cave house built into the side of a cliff. It sounds kinda nice, but when you're sick in bed and all alone, it really does just feel like being in an empty cave.

As I was waiting for my family, I decided to scroll through YouTube Shorts. Eventually, after enough scrolling, I somehow came across a short from the UFC. Specifically, it was a clip of Islam Makhachev landing an absolutely devastating head kick on Alexander Volkanovski in their rematch. I watched the shin connect, the follow-up hammer fist strikes, and Volkanovski drop to the ground like a fly. And then I watched it three more times. Then I kept swiping, watching more clips.


By the time my family came back, I'd fallen deep into the MMA rabbit hole. In the span of a few hours, I'd binged old fight highlights, knockouts, grappling exchanges, press conference clips, fighter breakdowns, and style analysis videos.

I was never a sports fan before, but that day, I became one.

Calculated Chaos

At first, the appeal of MMA was purely visceral. After all, MMA is a very dramatic sport. It's a spectacle filled with knockouts, deep submissions, bitter rivalries, walkouts, trash talk, upsets, and improbable comebacks. However, as I watched the sport more and more, the fights started to look less like chaos and more like a game of strategy.

A fight isn't just about two people trying to hurt each other. I mean, it is, but there are so many variables to take into account. Wrestling, boxing, kickboxing, jiu-jitsu, cardio, durability, psychology, age, height, reach, stance, pace, coaching, and game planning. Every fighter brings a unique set of skills and liabilities into the cage.

A matchup might look obvious beforehand, but once the fight actually takes place, the outcome turns out to be completely different due to one failed takedown or one desperate punch. Some fighters appear to be dominant in general, but have very specific bad matchups.

Fights rarely felt like a simple question of, "Who is better?" They usually felt more like, "Whose strengths actually matter against this specific opponent?" Over time, this question pulled me from just watching MMA as a fan to thinking about MMA as a modeling problem.

Over the last few weeks at the Recurse Center, I've been working on a project built around one central question.

Can math predict fist fights?

It's a somewhat silly question, but it sits at the intersection of several things that I've become increasingly interested in over the past few years: quantitative modeling, prediction markets, sports betting, machine learning, and, of course, combat sports.

The more I learned about quantitative trading and prediction markets, the more I started seeing fights differently. UFC cards (and really any sporting event) are also probability markets.

Before a bout happens, sportsbooks and exchanges will imply probabilities. A fighter isn't just a favorite or an underdog. They're trading at a price, and that price encodes a belief about the future outcome of the bout. If a fighter is priced at a 65% implied probability of winning, but you suspect that the fighter's true probability of winning is 75%, the question you're asking isn't just about who's going to win. Rather, you're asking yourself whether or not the market is wrong. This was absolutely fascinating, and I wanted to dedicate some time at RC to build a fully functioning MMA prediction and betting model.

When I started doing research to build a serious model, I came to realize that institutional betting on MMA and other combat sports essentially doesn't exist, at least as far as I could tell. So, I had to build a robust model from scratch. But before building a serious model, I had to answer a more basic question:

Why is MMA so hard to model in the first place?

Sports Betting vs. Prediction Markets

Before getting into the modeling problem, I think it's worth explaining why I find prediction markets more interesting than traditional sportsbooks.

Prediction markets became even more relevant to MMA in 2025, when TKO Group Holdings announced that Polymarket would become the official and exclusive prediction market partner of the UFC and Zuffa Boxing. As part of the same broader media shift, the UFC also announced that beginning in 2026, its events would move away from ESPN+ and the pay-per-view model. Instead, it struck a $7.7 billion, seven-year media rights deal to stream its events exclusively on Paramount+. Combat sports, media distribution, and prediction markets were all suddenly converging.


Traditional sportsbooks may look like markets from the outside because prices/odds move and bettors express opinions through wagers. Structurally, however, they aren't neutral exchanges. The sportsbook sets lines, controls limits, and manages customer risk. Most importantly, the sportsbook is the one that decides how much action it's willing to accept from participants.

If you're consistently profitable as a bettor on a sportsbook, you almost certainly won't be treated like a valued participant in the market. Instead, the sportsbook may limit your bets to an obscenely small amount (like a $2 max bet), and you may even be banned from betting if you're too good at spotting mispriced odds. It quite literally operates like a casino. The losing players are kept in constant rotation to make bets and lose money, and the sharp bettors who spot mispriced lines are kicked out like card counters.

Prediction markets, however, are much closer to actual financial markets.

In a binary prediction market, a contract pays $1 if an event happens and $0 if it doesn't. If a fighter's "Yes" contract trades at $0.62, the market is roughly saying that the fighter has a 62% chance to win the fight, ignoring any fees or other microstructure details.

Mathematically, this is so wonderfully simple:

contract price \approx ℙ (event happens)

If a contract trades at price $P_{\text{market}}$, then the market-implied probability is approximately:

p_{\text{market}} = P_{\text{market}}

From there, I can build a prediction model that can estimate its own probability:

p_{\text{model}} = P_{\text{model}} (fighter wins)

The simplest definition of edge is:

edge = p_{\text{model}} - p_{\text{market}}

For a contract that costs $c$ and pays $1 if correct, the expected value of buying one contract is:

EV = p_{\text{model}} (1 - c) - (1 - p_{\text{model}}) c

Here, the first term is the probability of winning multiplied by the profit if the contract resolves to $1. The second term is the probability of losing multiplied by the cost of the contract. This simplifies to:

EV = p_{\text{model}} - c

So, if my model says that an event has a 70% chance of happening and I can buy the "Yes" contract for $0.60, then:

EV = 0.70 - 0.60 = 0.10

In expectation, that's 10 cents of value per share before fees, potential slippage, liquidity constraints, and any model error. This allows you to reframe probability estimation into a tradable object. And effectively, it turns prediction markets from gambling into something closer to trading under uncertainty. Now, the important question is whether your edge is greater than your required margin of safety:

p_{\text{model}} - p_{\text{market}} > required margin of safety

Creating a prediction model is no simple task, but it's important to keep in mind that the model does not need to be perfect. However, it does have to be calibrated, disciplined, and better than the market in specific situations.
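Here's a minimal Python sketch of the arithmetic above. The function names and the 5-cent margin of safety are illustrative placeholders, not values from the actual project, and fees and slippage are ignored.

```python
def expected_value(p_model: float, cost: float) -> float:
    """Expected profit of one contract that pays $1 if correct, before fees."""
    if not 0.0 <= p_model <= 1.0:
        raise ValueError("p_model must be a probability in [0, 1]")
    if not 0.0 <= cost <= 1.0:
        raise ValueError("cost must be a price in [0, 1]")
    win_profit = 1.0 - cost  # profit if the contract resolves to $1
    loss = cost              # amount lost if the contract resolves to $0
    return p_model * win_profit - (1.0 - p_model) * loss


def clears_margin(p_model: float, p_market: float, margin_of_safety: float) -> bool:
    """True when the edge (p_model - p_market) exceeds the required margin."""
    return (p_model - p_market) > margin_of_safety


# The worked example from the text: model says 70%, "Yes" costs $0.60.
ev = expected_value(0.70, 0.60)
print(round(ev, 4))                      # 0.1
print(clears_margin(0.70, 0.60, 0.05))   # True (0.05 is a hypothetical margin)
```

The long form and the simplified form `p_model - cost` give the same number, which is a handy sanity check when fees get added later.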

The Illusion of Sportsbook Odds

Sportsbooks also use probabilities in the form of traditional betting odds. For decimal odds, which are primarily used in Europe, Australia, New Zealand, and other international sportsbooks, the implied probability calculation is straightforward:

p_{\text{implied}} = \frac{1}{\text{decimal odds}}

So, decimal odds of 2.00 imply:

p_{\text{implied}} = \frac{1}{2.00} = 0.50

American odds, which are more commonly used in, well, America, are a little annoying. There are two types of American odds, those being negative odds and positive odds.

Negative American odds imply a betting favorite. For negative American odds, such as -150, the way you interpret those odds is as follows:

"You have to wager $150 on this event in order to profit $100 if it happens."

The implied probability for this event is:

p_{\text{implied}} = \frac{| odds |}{| odds | + 100}

So:

p_{\text{implied}} = \frac{150}{150 + 100} = 0.60

Positive American odds imply an underdog. For positive American odds, such as +120, the way you interpret those odds is as follows:

"If you wager $100 on this event, and it takes place, then you will profit $120."

The implied probability for this event is:

p_{\text{implied}} = \frac{100}{odds + 100}

So:

p_{\text{implied}} = \frac{100}{120 + 100} \approx 0.455
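The three conversions above fit in a couple of small Python helpers. This is just a sketch: the function names are my own, and the input checks assume the usual conventions (decimal odds above 1, American odds at or beyond ±100).

```python
def decimal_to_prob(decimal_odds: float) -> float:
    """Implied probability from decimal odds: 1 / odds."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / decimal_odds


def american_to_prob(american_odds: int) -> float:
    """Implied probability from American odds (negative = favorite)."""
    if abs(american_odds) < 100:
        raise ValueError("American odds are conventionally <= -100 or >= +100")
    if american_odds < 0:
        stake = abs(american_odds)  # amount wagered to profit $100
        return stake / (stake + 100.0)
    # Positive odds: profit on a $100 wager
    return 100.0 / (american_odds + 100.0)


print(f"{decimal_to_prob(2.00):.3f}")   # 0.500
print(f"{american_to_prob(-150):.3f}")  # 0.600
print(f"{american_to_prob(120):.3f}")   # 0.455
```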

It's important to note, however, that sportsbook odds contain a hidden tax known as vig. Suppose a sportsbook says that Fighter A is a -150 betting favorite, and Fighter B is a +120 underdog. The raw implied probabilities are:

p_{A} = 0.600

p_{B} = 0.455

Together, these implied probabilities sum to:

p_{A} + p_{B} = 0.600 + 0.455 = 1.055

That extra 5.5% is the sportsbook's overround, where:

overround = \sum_{i} p_{\text{implied}, i} - 1

In this example:

overround = 1.055 - 1 = 0.055

So, the book isn't implying a clean 100% probability distribution. It's implying 105.5%, with that extra margin representing the sportsbook's edge. That margin is how the sportsbook expects to make money regardless of which fighter wins, as long as the action on both sides is reasonably balanced.

A simple way to remove the vig from sportsbook odds is to normalize each side by the total implied probability:

p_{\text{no-vig}, i} = \frac{p_{\text{implied}, i}}{\sum_{j} p_{\text{implied}, j}}

So, for Fighter A:

p_{A, \text{no-vig}} = \frac{0.600}{1.055} \approx 0.569

And for Fighter B:

p_{B, \text{no-vig}} = \frac{0.455}{1.055} \approx 0.431

Now the probabilities sum to 1. This matters because the true comparison isn't:

edge = p_{\text{model}} - p_{\text{raw implied}}

Rather, the comparison is closer to:

edge = p_{\text{model}} - p_{\text{fair market}}

That fair market probability can come from no-vig sportsbook odds, a prediction market midpoint, or an executable prediction market price. Even then, an apparent edge isn't enough. The edge has to survive fees, slippage, and other uncertainties that one might deal with when making a trade.
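As a quick sketch of the vig-removal step, here is the proportional normalization from the Fighter A / Fighter B example in Python. The function names are my own, and proportional scaling is only the simple method described above, not the only way to strip vig.

```python
def overround(implied_probs: list[float]) -> float:
    """Sum of raw implied probabilities minus 1 (the book's margin)."""
    return sum(implied_probs) - 1.0


def remove_vig(implied_probs: list[float]) -> list[float]:
    """Normalize raw implied probabilities so they sum to 1."""
    total = sum(implied_probs)
    if total <= 0:
        raise ValueError("implied probabilities must sum to a positive number")
    return [p / total for p in implied_probs]


# Fighter A at -150 (0.600) and Fighter B at +120 (0.455), as in the text.
raw = [0.600, 0.455]
fair = remove_vig(raw)

print(f"{overround(raw):.3f}")            # 0.055
print([f"{p:.3f}" for p in fair])         # ['0.569', '0.431']
```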

This is especially important in MMA because my model probabilities are uncertain in themselves. If my model says a fighter has a 53% chance of winning, and the market says it's 51%, that's probably not a meaningful edge.

But if my model says 62% and the market says 48%, that's a much more interesting situation, and it could lead to a tradeable action. So, I shouldn't just create a model that predicts which fighter will win. It's not always worth it to bet on a huge betting favorite, or to waste money on a huge underdog who realistically has no chance of winning even though the payout looks incredible. I need to predict the true probability, as well as determine whether or not a bet is worth it based on the market pricing/odds. However, as I stated before, modeling MMA is incredibly challenging.

Why MMA Breaks Traditional Sports Models

The more I worked on this project, the more convinced I became that MMA is perhaps the hardest sport on the planet to model quantitatively. In fact, I suspect this is part of why major quant firms don't seem to bet on UFC fights.

The reason isn't necessarily the sport's inherent randomness. There's definitely some structure in MMA. We do know that better fighters tend to win, and variables such as style, age, reach, cardio, takedown defense, and knockout power all matter. The problem is that MMA violates many of the mathematical and statistical assumptions that make traditional sports models useful.

In sports like baseball, basketball, football, soccer, tennis, and chess, you usually get a large volume of repeated observations. Baseball players see hundreds of pitches, and basketball teams generate thousands of possessions. Soccer teams play long seasons, and chess players can play hundreds or thousands of games over the course of their careers. Over time, statistics about teams and players in these sports solidify, and the noise in the data starts to average out.

Unfortunately, none of these things are true in MMA.

A good UFC fighter might fight two or three times per year. Many all-time greats in the sport retire with fewer than 30 professional wins. One of the best fighters in the world may have a smaller sample size than what a mid-tier baseball player produces in a month. A famous example is Khabib Nurmagomedov, one of the greatest lightweights of all time. He retired undefeated with a career record of 29-0, but only had 13 fights in the UFC. He retired during his prime shortly after his father's passing, and many wonder if he would've reached greater heights had he continued fighting.


And this problem exists across the board in MMA, which creates the central problem:

MMA has real signal, but it also has very little data through which that signal can reveal itself.

And that's only the first problem. There are a ton of other problems, too.

Elo Doesn't Work in MMA

When modeling head-to-head competition, one of the first tools that analysts often reach for is the Elo rating system.

Elo was originally designed for chess, but it has since been adapted to tennis, team sports, esports/video games, and other competitive settings. It's a very simple concept to grasp. Every competitor has a rating. If you win, your rating goes up, and if you lose, your rating goes down. Furthermore, the difference between two ratings determines the expected probability of each competitor winning.

A standard Elo expected score looks like this:

E_{A} = \frac{1}{1 + 10^{(R_{B} - R_{A}) / 400}}

After the match, the rating updates based on whether the competitor performed better or worse than expected:

R_{A}^{'} = R_{A} + K (S_{A} - E_{A})

where:

R_{A} = Fighter A's current rating

R_{B} = Fighter B's current rating

E_{A} = Fighter A's expected score

S_{A} = Fighter A's actual result

K = rating update factor

In a clean setting, this works beautifully. If a strong chess player beats a weaker chess player, the ratings barely change. This prevents strong players from farming Elo off weaker opponents, and it doesn't punish weaker players for losing to people who are expected to beat them.

If a weak player upsets a strong player, the ratings change drastically. Perhaps the "weaker" player is actually quite good, or the "strong" player is much worse than expected, so their respective Elo ratings will increase/decrease dramatically. Over thousands of games, the ratings of each player begin to converge toward a useful estimate of skill.
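To make the update rule concrete, here's a small Python sketch of the two Elo formulas above. The K-factor of 32 and the example ratings are assumptions for illustration only.

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return updated (r_a, r_b) after one game; score_a is 1 for a win, 0 for a loss."""
    e_a = elo_expected(r_a, r_b)
    e_b = 1.0 - e_a
    score_b = 1.0 - score_a
    return r_a + k * (score_a - e_a), r_b + k * (score_b - e_b)


# Hypothetical ratings: an 1800 favorite against a 1400 underdog.
print(f"{elo_expected(1800, 1400):.3f}")     # 0.909

fav, dog = elo_update(1800, 1400, score_a=1.0)
print(f"{fav:.1f} {dog:.1f}")                # 1802.9 1397.1 (expected result, small change)

fav, dog = elo_update(1800, 1400, score_a=0.0)
print(f"{fav:.1f} {dog:.1f}")                # 1770.9 1429.1 (upset, large change)
```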

At a deeper level, Elo is basically a one-dimensional latent skill model. It assumes that each competitor can be represented by a scalar rating, and that the probability of competitor A beating competitor B is some function of the rating difference.

Something like:

ℙ (A beats B) = σ (c (R_{A} - R_{B}))

where $σ$ is a sigmoid-shaped function and $c$ is a scaling constant.

That assumption creates a clean hierarchy. If:

R_{A} > R_{B} > R_{C}

then the model naturally implies:

ℙ (A beats B) > 0.5

ℙ (B beats C) > 0.5

and:

ℙ (A beats C) > 0.5

This is precisely what you want in chess. If Magnus Carlsen is stronger than an international master (like Levy Rozman), and that international master is stronger than a national master (like James Canty III), then Carlsen should almost always be favored against the national master.

But MMA isn't chess, not even close. The problem with MMA is that fighting skill isn't one-dimensional.

A fighter isn't simply "good" or "bad." We can definitely try defining it that way, but there are way too many styles and aspects of MMA for that to be a meaningful classifier for fighting skill. A fighter may be an elite kickboxer with poor takedown defense, a powerful wrestler with weak striking defense, a submission artist who can't reliably force grappling exchanges, a dangerous first-round finisher with terrible cardio, or a defensively responsible point fighter with limited finishing ability and a bad chin.

Elo struggles in MMA because it tries to place all of those fighters on a single line. But MMA is much closer to a high-dimensional game of rock-paper-scissors than it is to a line with two extremes.

A world-class kickboxer may destroy one opponent at range, then lose badly to a wrestler who can take them down repeatedly and win on control time or ground-and-pound. That wrestler may then lose to a jiu-jitsu practitioner with better submissions. That grappler may then lose to a pressure boxer who can keep the fight standing. The matchups in MMA simply don't form a clean hierarchy.

There's a common adage in the world of boxing and MMA:

Styles make fights.

The saying exists because it's true.

A simple scalar rating model can't naturally represent nontransitive loops. If Fighter A beats Fighter B, Fighter B beats Fighter C, and Fighter C beats Fighter A, a one-dimensional ranking is forced to treat at least one result as an upset or a rating error.

But in MMA, the loop may not be noise or an error of the system. It may be the entire point.

A famous recent example is the triangle between Sean Strickland, Dricus Du Plessis, and Khamzat Chimaev. Dricus Du Plessis beat Sean Strickland twice by piecing up Strickland on the feet with his unorthodox style. Khamzat Chimaev then wrestled Du Plessis and controlled him for 5 rounds straight, winning in one of the most one-sided championship fights in the history of the UFC. And then at UFC 328, Strickland shocked the world by countering Khamzat's wrestling advantage, forcing him to strike on the feet. Strickland ended up winning by split decision, giving Chimaev his first loss in professional MMA and delivering one of the biggest upsets in the history of the sport.


If you try to force those results into a clean linear ranking, the loop looks incoherent. But stylistically, it's not incoherent at all. Different weapons interact differently against different opponents.

A more realistic model needs to allow something like this:

ℙ (A beats B) = f (x_{A}, x_{B})

where $x_{A}$ and $x_{B}$ aren't scalar ratings, but full fighter feature vectors.

An even more explicit interaction model might look like this:

ℙ (A beats B) = σ (β^{T} (x_{A} - x_{B}) + x_{A}^{T} M x_{B})

The first term:

β^{T} (x_{A} - x_{B})

captures differences in fighter attributes.

The second term:

x_{A}^{T} M x_{B}

captures interactions between Fighter A's traits and Fighter B's traits.

That interaction term is basically the mathematical version of "styles make fights."
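Here's a toy Python sketch of how that interaction term can produce a rock-paper-scissors loop that no scalar rating can. Everything in it is hypothetical: the one-hot "style" vectors, the zero $β$, and the matrix values are made up for illustration. I also assume $M$ is antisymmetric so that ℙ(A beats B) and ℙ(B beats A) sum to 1.

```python
import math

# One-hot style vectors: [striker, wrestler, grappler] (illustrative only).
STRIKER = [1.0, 0.0, 0.0]
WRESTLER = [0.0, 1.0, 0.0]
GRAPPLER = [0.0, 0.0, 1.0]

BETA = [0.0, 0.0, 0.0]  # no style is "better" overall in this toy setup

# Antisymmetric interaction matrix: M[i][j] = -M[j][i].
# Positive entries favor the row style over the column style.
M = [
    [0.0, -1.0, 1.0],   # striker: loses to wrestler, beats grappler
    [1.0, 0.0, -1.0],   # wrestler: beats striker, loses to grappler
    [-1.0, 1.0, 0.0],   # grappler: loses to striker, beats wrestler
]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def win_prob(x_a, x_b, beta=BETA, m=M):
    """sigma(beta^T (x_a - x_b) + x_a^T M x_b)."""
    linear = dot(beta, [a - b for a, b in zip(x_a, x_b)])
    interaction = dot(x_a, [dot(row, x_b) for row in m])
    return sigmoid(linear + interaction)


print(f"{win_prob(WRESTLER, STRIKER):.3f}")   # 0.731
print(f"{win_prob(GRAPPLER, WRESTLER):.3f}")  # 0.731
print(f"{win_prob(STRIKER, GRAPPLER):.3f}")   # 0.731
```

Each style is favored over the next one in the loop, which is exactly the kind of nontransitive structure a one-dimensional rating has to write off as an upset.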

In practice, I'm not literally fitting that exact bilinear model in the current version of my project. My current pipeline uses CatBoost, a gradient-boosted tree method that can learn nonlinear feature interactions directly from tabular data. But conceptually, the fight isn't determined by two scalar ratings, like chess. Instead, it's determined by how two multidimensional fighter profiles collide.

The second issue with Elo is MMA's lack of volume.

Elo needs repeated observations. A chess player can play hundreds or thousands of games across their career. A tennis player can play dozens of matches in a season. A UFC fighter might fight twice in a year.

By the time a fighter's rating has enough data to stabilize, the fighter may already be physically different. They may be older, injured, in a new camp, cutting weight differently, or even past their prime.

MMA doesn't just have small sample sizes. It has small sample sizes attached to humans who change physically and mentally. That makes a pure Elo approach extremely faulty. It can be useful as one feature, but it's not enough to describe the full matchup between two fighters.

Poisson Models Also Don't Work

Another common tool in sports betting is the Poisson distribution.

Poisson models are especially common in lower-scoring sports like soccer or hockey, where analysts often model the number of goals scored by each team. The basic probability mass function of a Poisson distribution is:

P (X = k) = \frac{λ^{k} e^{- λ}}{k!}

where:

λ = expected event rate

k = number of observed events
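For reference, the probability mass function is only a few lines of Python. The rate of 1.4 goals per match below is a made-up number for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with expected event rate lam."""
    if k < 0:
        raise ValueError("k must be a non-negative integer")
    if lam < 0:
        raise ValueError("lam must be non-negative")
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Hypothetical soccer team that averages 1.4 goals per match.
lam = 1.4
print(f"{poisson_pmf(0, lam):.3f}")  # 0.247 (chance of being shut out)

# Probabilities over all possible counts should sum to (almost exactly) 1.
total = sum(poisson_pmf(k, lam) for k in range(50))
print(f"{total:.6f}")                # 1.000000
```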

More generally, if events occur according to a homogeneous Poisson process over time interval $t$ , then:

N (t) \sim Poisson (λ t)

That setup makes three key assumptions:

The observation window is fixed
Events are independent
The event intensity $λ$ is constant

Unfortunately for me, MMA breaks all three of these assumptions.

First, the time interval isn't really fixed. A soccer match has a heavily regulated 90-minute structure. An NBA basketball game has four 12-minute quarters. A baseball game has nine innings. All of these sports have a fixed, regulated structure. MMA, on the other hand, can have fights that last as long as 25 minutes, leading to a decision. Or, in the case of Jorge Masvidal vs. Ben Askren, a fight can be over in as little as 5 seconds.

A 5-round title fight and a 5-second knockout are both "one fight," but they produce radically different amounts of data. If you model fighting as if the same observation window always exists, then you're already distorting the problem.

Second, fight events aren't independent from each other.

A jab isn't just a jab, it's a tool that creates openings for more strikes. A knockdown changes the rest of the round. A failed takedown attempt can exhaust a fighter. A checked leg kick can damage someone's mobility. A body shot can lower the opponent's hands. A cut can change defensive behavior.

MMA is a chain reaction of dependent events. The probability of the next event depends heavily on the previous event.

And finally, the event rate isn't constant.

Fighters don't strike, wrestle, defend, or move at the same rate for 3 to 5 rounds. The first round of a fight may look nothing like the third. Cardio, damage, fear, confidence, coaching adjustments, and positional control all change the rate of events.

A fighter might throw fast, explosive boxing combinations early, slow down after failed takedowns, become more conservative after getting hurt, or suddenly increase output when they're down on the scorecards.

So even if we wanted to define something like:

λ = expected significant strikes per minute

that value isn't stable. A more realistic version would be state-dependent:

λ_{t} = λ (t, S_{t})

where $S_{t}$ represents the state of the fight at time $t$ : fatigue, damage, position, score pressure, distance, stance, and other contextual variables.

At that point, we're no longer in the clean world of a simple homogeneous Poisson model. We're dealing with a dynamic system where the state of the fight changes the distribution of future events.
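To get a feel for how much a non-constant rate matters, here's a deliberately simplified Python sketch. It only lets the rate depend on time (a hypothetical exponential fatigue decay), not on the full fight state $S_{t}$, and every number in it is made up.

```python
import math


def expected_count_constant(rate_per_min: float, minutes: float) -> float:
    """Expected events if the rate never changes: lambda * t."""
    return rate_per_min * minutes


def expected_count_decaying(initial_rate: float, decay_per_min: float,
                            minutes: float, step: float = 0.01) -> float:
    """Expected events when lambda(t) = initial_rate * exp(-decay * t).

    Uses a midpoint-rule numerical integral of lambda(t) over the fight.
    """
    n_steps = int(round(minutes / step))
    total = 0.0
    for i in range(n_steps):
        t_mid = (i + 0.5) * step
        total += initial_rate * math.exp(-decay_per_min * t_mid) * step
    return total


# Hypothetical fighter: 5 significant strikes per minute in the opening minute,
# fading by about 5% per minute over a 15-minute fight.
print(f"{expected_count_constant(5.0, 15.0):.1f}")        # 75.0
print(f"{expected_count_decaying(5.0, 0.05, 15.0):.1f}")  # 52.8
```

Even this mild fatigue assumption knocks roughly 30% off the constant-rate estimate, and a real fight adds damage, position, and score pressure on top of it.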

So Poisson modeling, while elegant, doesn't work at all in MMA.

When Samples are Tiny, Averages Lie

MMA analysis is full of statistics like significant strikes landed per minute, significant strike defense, takedown accuracy, takedown defense, submission attempts per ten minutes, knockdowns per fight, and average fight time.

These numbers are useful, but they can also be deeply misleading.

Suppose a fighter lands five significant strikes per minute. What does that mean?

It might mean they're a genuinely high-output striker. Or it might mean they fought one extremely hittable opponent. Or it might mean they spent three rounds beating up a tired, short-notice replacement fighter. Or maybe they had one five-round war that inflated their career average. Or it might mean that their opponents were unusually willing to stand with them.

We don't have a ton of context to truly make meaning of the statistic unless we actually watch the fight.

This is where the Law of Large Numbers becomes important. In high-volume settings, averages become more reliable because random variation gets washed out over many observations.

A simple way to express this is through the standard error of a sample mean:

SE = \frac{σ}{\sqrt{n}}

where:

σ = standard deviation of the underlying observations

n = number of observations

As $n$ grows, the standard error shrinks. That's why large samples are powerful.

But in MMA, $n$ is usually extremely small.

A fighter might have 15 professional fights, with only 6 of them in the UFC, and only 3 against high-level competition. If they changed weight classes, switched camps, returned from an injury, or took a long layoff, even those few observations may not describe the current version of the fighter.
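A quick Python sketch shows how slowly the standard error shrinks at MMA-sized samples. The standard deviation of 1.5 strikes per minute is a made-up value, and the formula assumes independent, identically distributed observations, which fights are not.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return sigma / math.sqrt(n)


sigma = 1.5  # hypothetical fight-to-fight spread in strikes landed per minute

print(f"{standard_error(sigma, 6):.3f}")    # 0.612 (six UFC fights)
print(f"{standard_error(sigma, 600):.3f}")  # 0.061 (a high-volume sport)
```

With six fights, the uncertainty on the average is ten times larger than it would be with six hundred observations.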

Another issue is the fact that the observations aren't identically distributed.

A fighter's performance against an elite wrestler and their performance against a short-notice striker aren't draws from the same clean distribution. Opponent quality changes the meaning of every statistic.

So when we write something like:

\bar{x} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}

to calculate the mean, we should remember that underneath it are different opponents, different rounds, different injuries, different camps, different weight cuts, and different fight states.

That makes historical averages dangerous, at least from a naive perspective. MMA data is extremely sparse, path-dependent, and opponent-dependent.

Skill Is Nonlinear

A lot of traditional models implicitly assume smoothness.

If age matters, maybe every additional year makes a fighter slightly worse. If reach matters, maybe every extra inch gives a small advantage. If strike differential matters, maybe a higher differential linearly improves win probability.

But MMA is full of cliffs, thresholds, and other weird interactions.

Age is the clearest example. The difference between a 25-year-old fighter and a 30-year-old fighter is fairly small. In fact, you could argue the average 30-year-old is better than the average 25-year-old due to experience, and most people would say that a fighter's physical prime falls in their late 20s to early 30s. However, the difference between a 35-year-old and a 40-year-old fighter can be enormous, especially in lower weight classes where speed, reaction time, and durability are crucial.

A linear model might write:

logit (p) = β_{0} + β_{1} \cdot age

That assumes each extra year has the same effect on log-odds. But in MMA, age probably behaves more like a threshold function:

effect(age) \neq β_{1} \cdot age

It may look more like:

effect(age) = \begin{cases} \text{small}, & \text{if } age < 35 \\ \text{large}, & \text{if } age \geq 35 \end{cases}

But even that is oversimplified. Heavyweights often age differently from lighter fighters, and their physical prime is expected to be in their mid to late-30s. Grapplers may age differently from explosive strikers. Defensive fighters may age differently from fighters who absorb huge damage.

The same is true for reach. Reach isn't automatically good. It matters more when the fighter knows how to maintain distance. It matters less if the opponent can force clinches or takedowns. It may even become a liability if the fighter with longer reach struggles defensively in boxing range.

The same is true for striking output. High output can be a positive strength, but it can also create openings. A fighter who throws a lot may be easier to counter. A kicker may be easier to take down if they're facing a strong wrestler. A boxer may dominate someone who needs space but run into danger against a powerful counterpuncher.

Most individual features don't mean much in isolation. They only mean something in relation to the opponent, their stats, and their fighting style. That's why MMA modeling is fundamentally a matchup problem, and hence, styles make fights.

What Should a Model Actually Learn?

At this stage of the project, my conclusion was that a useful MMA model can't simply ask, "Who is the better fighter?" That question is just too vague, and doesn't answer anything meaningful. Instead, my model should ask itself, "How do these two specific fighters interact?"

A fighter's raw striking output isn't enough. I care about their striking output relative to the opponent's defense, wrestling threat, durability, and pace. A fighter's takedown accuracy isn't enough. I care about it relative to the opponent's takedown defense, stance, leg reach, scrambling ability, and cardio. A fighter's age isn't enough. I care about age in relation to weight class, style, damage history, and recent activity.

The model needs to learn interactions like:

high striking volume vs. elite wrestling
southpaw vs. orthodox stance dynamics
aging striker vs. younger pressure fighter
powerful finisher vs. defensively sound decision fighter
strong top-control grappler vs. submission threat off the back
low-volume counterstriker vs. high-output point fighter

This is why I moved toward machine learning methods that can capture nonlinear relationships and feature interactions in tabular data. My current version of the project uses CatBoost, but the important point in this first post isn't the specific model choice. The important point is the shape of the problem.


MMA is sparse, nonlinear, state-dependent, and matchup-driven. That means the model has to be built around those realities instead of pretending the sport is cleaner than it is.

Why Probability Calibration Matters

Once the problem becomes market-facing and ready for trades, it isn't enough to just measure accuracy or win rate.

Suppose a model predicts 100 fights and gets 60 correct. That sounds pretty decent. 60% accuracy isn't bad at all. But for trading, the more important question is whether the probabilities were calibrated properly.

If the model says 10 fighters each have a 70% chance to win, then roughly 7 of them should win over a large enough sample. If only 5 win, the model may still be directionally useful, but it may be overconfident. If 9 win, then the model may be underconfident.

Calibration means that predicted probabilities correspond to real-world frequencies.

Formally, for a calibrated model:

ℙ (Y = 1 ∣ \hat{p} = p) = p

Essentially, among all events where the model predicts probability $p$, the event should happen approximately a fraction $p$ of the time. A predicted 0.70, for example, should come true about 70% of the time.

This matters because expected value depends directly on the probability estimate. If my model says a fighter has a 65% chance to win, but the true probability is closer to 55%, then every downstream trading decision is corrupted.
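One simple way to check calibration is to bucket predictions and compare the average predicted probability in each bucket with the observed win rate. Here's a minimal Python sketch of that idea. The bucket count and the toy data are assumptions, not results from the actual model.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Group predictions into equal-width bins and compare predicted vs. observed.

    Returns rows of (bin_low, bin_high, count, mean_predicted, observed_rate).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    rows = []
    for idx, items in enumerate(bins):
        if not items:
            continue  # skip empty buckets
        mean_pred = sum(p for p, _ in items) / len(items)
        observed = sum(y for _, y in items) / len(items)
        rows.append((idx / n_bins, (idx + 1) / n_bins, len(items), mean_pred, observed))
    return rows


# Toy data: ten fighters each given a 70% chance, of whom only five won.
probs = [0.70] * 10
outcomes = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

for low, high, count, mean_pred, observed in calibration_table(probs, outcomes):
    print(f"[{low:.1f}, {high:.1f}) n={count} predicted={mean_pred:.2f} observed={observed:.2f}")
# [0.6, 0.8) n=10 predicted=0.70 observed=0.50
```

A gap like 0.70 predicted versus 0.50 observed is the overconfidence case described above, though ten fights is far too few to conclude much.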

A common way to evaluate probabilistic predictions is the Brier score:

Brier Score = \frac{1}{N} \sum_{i = 1}^{N} ({\hat{p}}_{i} - y_{i})^{2}

where:

{\hat{p}}_{i} = predicted probability for fight i

y_{i} = actual outcome for fight i

In essence, a lower Brier score means the predicted probabilities are closer to the realized outcomes.

Another common metric is log loss:

Log Loss = - \frac{1}{N} \sum_{i = 1}^{N} [y_{i} \log ({\hat{p}}_{i}) + (1 - y_{i}) \log (1 - {\hat{p}}_{i})]

Log loss punishes confident, wrong predictions heavily. That's useful in a market setting because overconfidence is costly. A model that repeatedly makes 90% predictions and gets them wrong isn't just inaccurate, it's financially dangerous.
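Both metrics are short enough to write out by hand. The Python sketch below follows the two formulas above. The clipping constant `eps` is my own addition to avoid taking log(0), and the example predictions are made up.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Average negative log-likelihood of the outcomes under the predictions."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never sees 0 or 1
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(probs)


# One losing fighter, scored under a confident and a cautious prediction.
print(f"{brier_score([0.9], [0]):.3f} {log_loss([0.9], [0]):.3f}")  # 0.810 2.303
print(f"{brier_score([0.6], [0]):.3f} {log_loss([0.6], [0]):.3f}")  # 0.360 0.916
```

Both predictions were wrong, but the 90% call is penalized about 2.5 times as hard by log loss.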

This is one of the reasons I find this project interesting. The project is about making calibrated probability estimates under uncertainty, not about classifying who wins/loses.

Embracing the Chaos

MMA is hard to model because it refuses to behave like a clean statistical problem.

Elo is useful, but a single rating can't capture the multidimensional geometry of fighting. Poisson models are neat and elegant, but fights aren't stable event-rate processes. Historical averages are great, but fighter samples are tiny and heavily contextual. Linear assumptions are convenient, but combat sports are full of thresholds, cliffs, and interaction effects.

The betting side is also harder than it looks. A market price isn't automatically a fair probability. Sportsbook odds include vig. Prediction markets have spreads, fees, liquidity constraints, and execution risk, meaning a small apparent edge can disappear once those factors are included.

A good MMA model has to respect the chaos, without surrendering to it completely. It has to admit that the data is sparse, the sport is uncertain, and a single mistake can end the fight. But it also has to recognize that a fight isn't purely random. There's structure in the matchups, there's signal in the features, and there are probabilities hiding underneath the noise. There is a method to the madness.

Most importantly, the model doesn't have to predict every fight perfectly. The true goal is to estimate the probability of a fight outcome better than the market, often enough to matter. That is where you will find profitable market opportunities.

That's the main project I've been working on at RC. It started from the question of whether math can predict a fist fight, which has led toward building a system that can estimate fair value in one of the messiest sports markets on the planet.

In the next post, I'll talk more about the actual model, including the dataset, feature engineering, CatBoost pipeline, calibration, confidence buckets, and how I think about turning fight predictions into market decisions.