How the BreakPoint ML Model Works
The Challenge of Tennis Prediction
Tennis is uniquely difficult to predict. Unlike team sports, it hinges on individual matchups. A player who dominates on clay might struggle on grass. Recent form, surface expertise, and head-to-head history all play critical roles.
At BreakPoint, we built an ensemble ML model that combines the strengths of multiple prediction approaches to tackle this complexity.
Our Two-Layer Approach
Layer 1: Glicko-2 Rating System
Every player in our database has a surface-specific Glicko-2 rating — separate ratings for clay, grass, hard indoor, and hard outdoor. The Glicko-2 system improves on traditional Elo in two important ways:
- Rating Deviation (RD): Measures how certain we are about a player's rating. A player who hasn't played in months has a higher RD, meaning their rating is less reliable.
- Volatility: Captures how consistently a player performs. A player with erratic results has higher volatility.
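To make the surface-specific setup concrete, here is a minimal sketch of how per-surface Glicko-2 state could be stored. The class, field names, and starting values (1500 rating, 350 RD, 0.06 volatility are the conventional Glicko-2 defaults) are illustrative assumptions, not BreakPoint's actual schema:

```python
from dataclasses import dataclass, field

# Conventional Glicko-2 starting values for an unrated player.
GLICKO2_DEFAULT = dict(rating=1500.0, rd=350.0, volatility=0.06)

# The four surfaces tracked separately, per the text.
SURFACES = ("clay", "grass", "hard_indoor", "hard_outdoor")

@dataclass
class PlayerRatings:
    """One (rating, RD, volatility) triple per surface."""
    surfaces: dict = field(
        default_factory=lambda: {s: dict(GLICKO2_DEFAULT) for s in SURFACES}
    )

    def get(self, surface: str) -> dict:
        return self.surfaces[surface]

# A clay specialist with recent activity: high rating, low RD.
player = PlayerRatings()
player.surfaces["clay"]["rating"] = 1720.0
player.surfaces["clay"]["rd"] = 80.0   # low RD = reliable rating
```

A player returning from a long layoff would instead carry a high RD on every surface, so the model treats their rating as less trustworthy.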
Layer 2: ML Ensemble (Logistic Regression + XGBoost)
Our ML model takes a 10-feature vector for each matchup:
| Feature | Description |
|---------|-------------|
| Rating difference | Glicko-2 gap between players |
| Player 1 RD | Rating confidence for player 1 |
| Player 2 RD | Rating confidence for player 2 |
| Form difference | Recent win rate gap (last 10 matches) |
| Surface win rate (P1) | 2-year surface-specific win percentage |
| Surface win rate (P2) | 2-year surface-specific win percentage |
| Surface WR difference | Gap in surface expertise |
| Months inactive (P1) | Time since last match |
| Months inactive (P2) | Time since last match |
| Tournament round | Round encoding (R32 through Final) |
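Assembling that vector is straightforward. The sketch below assumes each player arrives as a dict whose keys mirror the table and that the round is pre-encoded as an integer; the dict keys and encoding are hypothetical:

```python
def feature_vector(p1: dict, p2: dict, round_code: int) -> list:
    """Build the 10-feature matchup input described in the table above."""
    return [
        p1["rating"] - p2["rating"],          # rating difference
        p1["rd"],                             # player 1 RD
        p2["rd"],                             # player 2 RD
        p1["form10"] - p2["form10"],          # recent-form gap (last 10)
        p1["surface_wr"],                     # 2-yr surface win rate, P1
        p2["surface_wr"],                     # 2-yr surface win rate, P2
        p1["surface_wr"] - p2["surface_wr"],  # surface WR difference
        p1["months_inactive"],                # time since last match, P1
        p2["months_inactive"],                # time since last match, P2
        round_code,                           # e.g. R32=0 ... Final=4
    ]
```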
The ensemble combines Logistic Regression (good at linear relationships) with calibrated XGBoost (captures non-linear patterns like surface specialization).
Why Calibration Matters
Raw ML probabilities are often overconfident. Our XGBoost model originally pushed predictions to extremes — predicting 85% win probability when the true rate was closer to 70%.
We solved this with isotonic calibration (CalibratedClassifierCV), which maps raw probabilities to their true observed frequencies. Our Expected Calibration Error (ECE) dropped to just 0.025, meaning when we say "70% probability," the player actually wins about 70% of the time.
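ECE itself is easy to compute: bucket predictions into probability bins, then take the size-weighted average gap between predicted probability and observed win rate. A self-contained sketch (in production the model would be wrapped in `CalibratedClassifierCV(model, method="isotonic")`, per the text):

```python
import numpy as np

def expected_calibration_error(probs, outcomes, n_bins: int = 10) -> float:
    """Size-weighted average |predicted prob - observed win rate| per bin."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    # Assign each prediction to an equal-width bin; clamp 1.0 into the top bin.
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(probs[mask].mean() - outcomes[mask].mean())
    return ece
```

A perfectly calibrated model scores 0; an overconfident one (say, predicting 0.9 for matches won only half the time) scores much higher.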
The Market-Model Blend
Here's our most important insight: the betting market is efficient, but not perfect. We found that our model systematically overvalues underdogs — a common bias in ML prediction systems.
Our solution is a two-tier blending strategy:
- Normal matches: 60% model probability + 40% market-implied probability
- Heavy favorites (>80% implied): 40% model + 60% market
This correction significantly improved our real-world accuracy while preserving our edge in closer matchups where the model adds the most value.
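The two-tier blend reduces to a few lines. This sketch assumes `model_p` and `market_p` are win probabilities for the same player, and that "heavy favorite" is judged from whichever side of the market line exceeds 80% (an assumption about how the tiers are applied):

```python
def blended_probability(model_p: float, market_p: float) -> float:
    """Two-tier market-model blend: lean on the market for heavy favorites."""
    if max(market_p, 1.0 - market_p) > 0.80:
        # Heavy favorite in play: 40% model + 60% market.
        return 0.40 * model_p + 0.60 * market_p
    # Normal match: 60% model + 40% market.
    return 0.60 * model_p + 0.40 * market_p
```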
Finding +EV Opportunities
An opportunity exists when our blended probability exceeds the vig-normalized implied probability from bookmaker odds. We normalize for vig (the bookmaker's margin, typically 2-3%) to avoid counting phantom edges.
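Vig normalization is the standard trick of rescaling the raw inverse odds so they sum to 1. A sketch for a two-outcome match with decimal odds:

```python
def implied_probabilities(odds_p1: float, odds_p2: float) -> tuple:
    """Convert decimal odds to vig-free implied probabilities."""
    raw1, raw2 = 1.0 / odds_p1, 1.0 / odds_p2
    overround = raw1 + raw2   # exceeds 1 by the bookmaker's margin
    return raw1 / overround, raw2 / overround
```

For example, a 1.80 / 2.10 line has raw inverses summing to about 1.03 (a ~3% margin); dividing through removes that margin so a model edge isn't measured against inflated probabilities.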
Our thresholds:
- 3-4% edge: Borderline (worth watching)
- 4-10% edge: Recommended bet
- >10% edge: Suspicious (likely stale odds or a model limitation)
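The thresholds above map directly to a small classifier; the tier names returned here are illustrative:

```python
def classify_edge(blended_p: float, implied_p: float) -> str:
    """Map the model-vs-market edge to the threshold tiers listed above."""
    edge = blended_p - implied_p
    if edge > 0.10:
        return "suspicious"   # likely stale odds or a model limitation
    if edge >= 0.04:
        return "recommended"
    if edge >= 0.03:
        return "borderline"
    return "no bet"
```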
What's Next
We're continuously improving our model with:
- Expanded training data (more historical matches)
- Weather and court speed factors
- Player fatigue modeling (tournament scheduling)
- Real-time odds movement tracking
Want to see the model in action? Check out today's opportunities or run your own prediction.
Ready to find your edge?
Use ML-powered predictions and +EV opportunity scanning to make smarter tennis bets.
View Today's Opportunities