Probability of Backtest Overfitting
A formal way to estimate how likely your great backtest is a fluke — and act on it.
Probability of Backtest Overfitting (PBO) is a formal technique to estimate the chance that your impressive backtest is a fluke that won’t survive live trading. It turns the vague worry “am I overfitting?” into an actual probability you can act on.
- What it measures: the probability that your in-sample “best” strategy is a fluke that underperforms out-of-sample.
- How: split the data into many in-sample/out-of-sample combinations; for each one, pick the in-sample winner and check how often it ranks below the median out-of-sample.
- Reading it: a high PBO (above roughly 0.5) means the result is likely overfit or lucky; a low PBO means the edge tends to persist and is more trustworthy.
- The point: a quantified humility meter. The discipline is to reject high-PBO strategies rather than rationalise them.
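The procedure in the list above can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes a matrix of per-period returns with one column per strategy variation, scores each strategy with a simple mean-over-standard-deviation ratio, and cuts time into equal blocks that are recombined into in-sample and out-of-sample halves (in the spirit of combinatorially symmetric cross-validation). The names `sharpe`, `estimate_pbo`, and `n_blocks` are hypothetical.

```python
from itertools import combinations

import numpy as np


def sharpe(returns):
    """Per-column mean / std score; zero-variance columns score 0."""
    mu = returns.mean(axis=0)
    sd = returns.std(axis=0, ddof=1)
    return np.divide(mu, sd, out=np.zeros_like(mu), where=sd > 0)


def estimate_pbo(returns, n_blocks=8):
    """Share of splits where the in-sample winner lands in the
    bottom half of the out-of-sample ranking.

    returns: array of shape (n_periods, n_strategies), an assumed layout.
    n_blocks: even number of time blocks to recombine (assumed default).
    """
    returns = np.asarray(returns, dtype=float)
    n_periods, n_strategies = returns.shape
    if n_blocks < 2 or n_blocks % 2:
        raise ValueError("n_blocks must be an even number >= 2")

    # Contiguous time blocks, so each half keeps its local ordering.
    blocks = np.array_split(np.arange(n_periods), n_blocks)

    below_median = []
    for in_ids in combinations(range(n_blocks), n_blocks // 2):
        out_ids = [b for b in range(n_blocks) if b not in in_ids]
        is_rows = np.concatenate([blocks[b] for b in in_ids])
        oos_rows = np.concatenate([blocks[b] for b in out_ids])

        is_perf = sharpe(returns[is_rows])
        oos_perf = sharpe(returns[oos_rows])

        # The configuration you would have picked from the in-sample data.
        winner = int(np.argmax(is_perf))

        # Out-of-sample rank of that winner: 1 = worst, n_strategies = best.
        rank = int((oos_perf < oos_perf[winner]).sum()) + 1
        relative_rank = rank / (n_strategies + 1)
        below_median.append(relative_rank < 0.5)

    return float(np.mean(below_median))
```

Called on a returns matrix, `estimate_pbo` gives a number between 0 and 1. A value near 0 says the in-sample winner usually stays in the top half out-of-sample; a value near or above 0.5 says picking the winner tells you little or nothing about future ranking.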
Do I need PBO for every strategy I build?
Not always formally, but its *mindset* is essential: always ask how much your result depends on having picked the luckiest configuration. For serious, optimised strategies — especially ones found by searching many variations — a PBO-style analysis (or at least rigorous walk-forward and out-of-sample testing) is invaluable for separating a real edge from an expensive illusion.
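For the lighter-weight alternative mentioned above, a walk-forward test can be sketched as a generator of rolling train/test index ranges. This is an illustrative sketch only: the function name `walk_forward_splits` and its parameters are hypothetical, and it assumes fixed-size windows that advance by one test window at a time.

```python
def walk_forward_splits(n_periods, train_size, test_size):
    """Yield (train, test) index ranges that roll forward through time.

    Each test window starts immediately after its training window,
    so the strategy is always evaluated on data it has not seen.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")

    start = 0
    while start + train_size + test_size <= n_periods:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Advance by one test window so test periods never overlap.
        start += test_size
```

You would fit or optimise the strategy on each `train` range, record its performance on the matching `test` range, and judge the strategy on the stitched-together test results only.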