Calibration
Probability vs Reality
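When the model says 60%, does the bet actually win 60% of the time? Below: each point is one 10-percentage-point probability bucket. Points on the diagonal = perfectly calibrated; points below it mean the model was overconfident, points above it underconfident.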
ECE
4.39%
Expected Calibration Error: bucket gaps (absolute value) weighted by bucket size · <5% is good
Brier Score
0.2089
mean squared error of predicted probability vs 0/1 outcome · lower is better · <0.25 beats always predicting 50%
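For reference, a minimal sketch of how a Brier score like the one above can be computed, and why 0.25 is the coin-flip baseline. The function name `brier_score` and the sample inputs are illustrative assumptions, not taken from the dashboard's own code.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted win probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - won) ** 2 for p, won in zip(probs, outcomes)) / len(probs)


# Hypothetical data: always predicting 50% scores exactly 0.25,
# whatever the outcomes are, because (0.5 - 0)^2 == (0.5 - 1)^2 == 0.25.
coin_flip = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
print(coin_flip)
```

Any score below 0.25 therefore means the predictions carry more information than a constant 50% guess.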
Settled Bets
262
contributing to calibration
Reliability Diagram
Predicted probability → actual win rate
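The points in a reliability diagram come from grouping settled bets by predicted probability and comparing each group's mean prediction with its observed win rate. The sketch below shows one way to do that with equal-width buckets; `bucket_rows`, its parameters, and the dictionary keys are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def bucket_rows(probs, outcomes, n_buckets=10):
    """Group (probability, outcome) pairs into equal-width probability buckets."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")

    groups = defaultdict(list)
    for p, won in zip(probs, outcomes):
        # A prediction of exactly 1.0 is clamped into the top bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        groups[idx].append((p, won))

    rows = []
    for idx in sorted(groups):
        pairs = groups[idx]
        n = len(pairs)
        rows.append({
            "bucket": (idx / n_buckets, (idx + 1) / n_buckets),
            "predicted": sum(p for p, _ in pairs) / n,  # x-coordinate of the point
            "actual": sum(won for _, won in pairs) / n,  # y-coordinate of the point
            "n": n,
        })
    return rows
```

Empty buckets produce no row, which would explain a breakdown that starts above 0% when no bet was predicted that low.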
Per-bucket breakdown
| Prob bucket | Mean predicted | Actual win rate | Gap (predicted minus actual) | n |
|---|---|---|---|---|
| 10–20% | 14.9% | 16.7% | -1.7pp | 12 |
| 20–30% | 25.7% | 13.8% | +11.9pp | 29 |
| 30–40% | 35.4% | 33.3% | +2.0pp | 27 |
| 40–50% | 44.7% | 56.3% | -11.6pp | 32 |
| 50–60% | 55.6% | 52.3% | +3.4pp | 44 |
| 60–70% | 64.2% | 63.8% | +0.4pp | 58 |
| 70–80% | 74.6% | 76.2% | -1.6pp | 42 |
| 80–90% | 83.5% | 86.7% | -3.2pp | 15 |
| 90–100% | 92.5% | 66.7% | +25.8pp | 3 |
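The ECE figure can be approximately reproduced from this table as the n-weighted mean of the absolute gaps. The sketch below uses the rounded table values, so it lands near 4.4pp, not exactly on the 4.39% shown; the small difference is presumably rounding, which is an assumption. The names `ROWS` and `expected_calibration_error` are illustrative.

```python
# (mean predicted %, actual win rate %, n) copied from the per-bucket table.
ROWS = [
    (14.9, 16.7, 12),
    (25.7, 13.8, 29),
    (35.4, 33.3, 27),
    (44.7, 56.3, 32),
    (55.6, 52.3, 44),
    (64.2, 63.8, 58),
    (74.6, 76.2, 42),
    (83.5, 86.7, 15),
    (92.5, 66.7, 3),
]


def expected_calibration_error(rows):
    """Weighted mean of |predicted - actual|, weighted by bucket size."""
    total = sum(n for _, _, n in rows)
    return sum(n * abs(pred - actual) for pred, actual, n in rows) / total


total_bets = sum(n for _, _, n in ROWS)  # matches the Settled Bets count
ece_pp = expected_calibration_error(ROWS)
print(f"{total_bets} bets, ECE is roughly {ece_pp:.1f}pp")
```

Because the weights are bucket sizes, the 90–100% bucket contributes little to ECE despite its 25.8pp gap: it holds only 3 of the 262 bets.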
