Calibration

Probability vs Reality

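When the model says 60%, does the bet actually win 60% of the time? In the reliability diagram below, each point is one 10-percentage-point probability bucket. Points on the diagonal are perfectly calibrated; points below it mean the model was overconfident in that bucket, and points above it mean it was underconfident.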
ECE: 4.39% (weighted bucket gap · <5% is good)
Brier Score: 0.2089 (MSE · <0.25 beats coin flip)
Settled Bets: 262 (contributing to calibration)
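The "<0.25 beats coin flip" threshold follows from the definition of the Brier score as the mean squared error between predicted probabilities and 0/1 outcomes: a constant 50% forecast is off by 0.5 on every bet, and 0.5² = 0.25. A minimal sketch of that arithmetic, where the function name `brier_score` and the sample outcomes are illustrative assumptions rather than anything taken from the dashboard:

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Hypothetical settled bets (1 = win, 0 = loss); not real dashboard data.
outcomes = [1, 0, 0, 1, 1]

# A constant 50% forecast scores 0.25 whatever the outcomes are.
coin_flip = brier_score([0.5] * len(outcomes), outcomes)

# A forecast that leans the right way on each bet scores lower (better).
informed = brier_score([0.8, 0.3, 0.2, 0.7, 0.6], outcomes)

print(coin_flip, informed)
```

Lower is better, so the reported 0.2089 sits on the better side of the 0.25 baseline.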
Reliability Diagram
Predicted probability → actual win rate
Per-bucket breakdown
Prob bucket   Predicted   Actual   Gap (predicted - actual)   n
10–20%        14.9%       16.7%    -1.7pp                     12
20–30%        25.7%       13.8%    +11.9pp                    29
30–40%        35.4%       33.3%    +2.0pp                     27
40–50%        44.7%       56.3%    -11.6pp                    32
50–60%        55.6%       52.3%    +3.4pp                     44
60–70%        64.2%       63.8%    +0.4pp                     58
70–80%        74.6%       76.2%    -1.6pp                     42
80–90%        83.5%       86.7%    -3.2pp                     15
90–100%       92.5%       66.7%    +25.8pp                    3
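The ECE figure can be roughly reproduced from the per-bucket breakdown. The sketch below assumes that "weighted bucket gap" means the absolute predicted-minus-actual gap in each bucket, weighted by that bucket's share of settled bets; the dashboard does not spell out its exact formula, so treat this as an approximation. The names `BUCKETS` and `expected_calibration_error` are illustrative, and the numbers are the rounded values from the breakdown.

```python
# (bucket, mean predicted %, actual win rate %, n) copied from the breakdown.
BUCKETS = [
    ("10-20%", 14.9, 16.7, 12),
    ("20-30%", 25.7, 13.8, 29),
    ("30-40%", 35.4, 33.3, 27),
    ("40-50%", 44.7, 56.3, 32),
    ("50-60%", 55.6, 52.3, 44),
    ("60-70%", 64.2, 63.8, 58),
    ("70-80%", 74.6, 76.2, 42),
    ("80-90%", 83.5, 86.7, 15),
    ("90-100%", 92.5, 66.7, 3),
]


def expected_calibration_error(buckets):
    """n-weighted mean of |predicted - actual| across buckets, in percentage points."""
    total = sum(n for *_, n in buckets)
    return sum(n * abs(pred - actual) for _, pred, actual, n in buckets) / total


total_bets = sum(n for *_, n in BUCKETS)
ece = expected_calibration_error(BUCKETS)

print(f"settled bets: {total_bets}")
print(f"ECE from rounded table values: {ece:.2f}pp")
```

The bucket counts sum to the 262 settled bets, and the result lands close to the reported 4.39%; a small difference is expected because the inputs here are already rounded to one decimal place. Most of the error comes from the 20–30% and 40–50% buckets. The 90–100% bucket has the largest gap, but with only 3 bets it carries little weight and is too small a sample to read much into.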