WealthJot.ai

Overfitting: Memorising the Past

Advanced · 8 min read

Add enough rules and any strategy looks perfect on history and useless tomorrow. How to catch it.

Overfitting (or “curve-fitting”) is tailoring a strategy so closely to historical data that it memorises the past instead of learning a real pattern. An overfitted strategy looks flawless on the data it was built on and falls apart on anything new. It is the central disease of quantitative trading.

The core insight: every market history contains both signal (real, repeatable patterns) and noise (random coincidence), and overfitting is fitting the noise. The more rules, parameters and conditions you add, the more perfectly you can describe one specific past, including its accidents: “buy on Tuesdays when RSI is 47.3 and volume is above the 19-day average.” Such a strategy isn’t finding an edge; it’s drawing a line through every dot, including the random ones. Because noise never repeats, the overfitted strategy is useless on the future. The cruel paradox: the more impressive your backtest (more rules → smoother curve → higher returns), often the more overfitted and worthless it is. The defences are simplicity (few parameters), *out-of-sample testing* (does it survive unseen data?), and *a named economic reason* it should work (signal has a cause; noise doesn’t).
  • What it is: fitting the noise in one specific history, so it memorises rather than generalises.
  • The tell: many parameters/conditions, a suspiciously perfect curve, and no explanation for why it works.
  • The paradox: a more impressive backtest is often a more overfitted (and more useless) one.
  • The defences: fewer rules (parsimony), out-of-sample/walk-forward testing, and a real economic mechanism.
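The idea that more conditions describe one past more perfectly can be checked on data with no signal in it at all. The sketch below is a toy illustration, not a trading model. It assumes every day's outcome is a coin flip, invents six random "condition" buckets per day, and treats a strategy as a lookup table keyed on the first k conditions. The names `make_noise_data`, `fit_rules` and `hit_rate`, and all the sizes, are hypothetical.

```python
import random
from collections import defaultdict

def make_noise_data(n, n_features, rng):
    """Rows of (conditions, went_up): random buckets paired with a coin-flip outcome."""
    return [
        (tuple(rng.randrange(4) for _ in range(n_features)), rng.random() < 0.5)
        for _ in range(n)
    ]

def fit_rules(rows, k):
    """Build a rule table on the first k conditions: predict the majority outcome per key."""
    votes = defaultdict(int)
    for conditions, went_up in rows:
        votes[conditions[:k]] += 1 if went_up else -1
    return {key: total >= 0 for key, total in votes.items()}

def hit_rate(rules, rows, k):
    """Share of rows where the rule table's prediction matches the outcome."""
    hits = sum(rules.get(conditions[:k], True) == went_up for conditions, went_up in rows)
    return hits / len(rows)

rng = random.Random(0)                      # fixed seed so the run is repeatable
in_sample = make_noise_data(500, 6, rng)    # the "history" the rules are built on
out_sample = make_noise_data(2000, 6, rng)  # unseen data from the same coin-flip process

results = {}
for k in range(7):
    rules = fit_rules(in_sample, k)
    results[k] = (hit_rate(rules, in_sample, k), hit_rate(rules, out_sample, k))
    print(f"{k} conditions: in-sample {results[k][0]:.0%}, out-of-sample {results[k][1]:.0%}")
```

Each extra condition only splits existing groups further, so the in-sample hit rate can never fall as k grows. The out-of-sample hit rate should stay close to 50%, because there was never anything to learn.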
Example: You keep adding filters until a strategy shows a perfect 60% CAGR with no losing years on 2015–2020, with ten finely tuned parameters. Run it on 2021–2023 and it loses money. It didn’t learn an edge; it memorised 2015–2020’s exact quirks. A simpler 3-rule version with a clear rationale that returned a “boring” 14% would likely have survived.
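The in-sample versus out-of-sample comparison in this example can be sketched as a chronological split followed by a CAGR calculation on each side. The yearly returns below are invented purely to mirror the shape of the example (60% a year in-sample, losses afterwards). The helper names `cagr` and `split_by_year` are hypothetical.

```python
def cagr(yearly_returns):
    """Compound annual growth rate from a list of yearly returns (0.10 means +10%)."""
    growth = 1.0
    for r in yearly_returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(yearly_returns)) - 1.0

def split_by_year(returns_by_year, last_in_sample_year):
    """Chronological split: years up to the cutoff are the ones the strategy was built on."""
    years = sorted(returns_by_year)
    in_sample = [returns_by_year[y] for y in years if y <= last_in_sample_year]
    out_of_sample = [returns_by_year[y] for y in years if y > last_in_sample_year]
    return in_sample, out_of_sample

# Invented numbers for illustration only, not real backtest results.
tuned_strategy = {
    2015: 0.60, 2016: 0.60, 2017: 0.60, 2018: 0.60, 2019: 0.60, 2020: 0.60,
    2021: -0.10, 2022: -0.05, 2023: -0.08,
}

in_sample, out_of_sample = split_by_year(tuned_strategy, 2020)
print(f"in-sample CAGR:     {cagr(in_sample):.1%}")
print(f"out-of-sample CAGR: {cagr(out_of_sample):.1%}")
```

The split is by date, not at random: the out-of-sample years must come after everything the strategy was tuned on, otherwise the test leaks information from the future into the fit.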
Common mistake: Tweaking and adding rules *until the backtest looks great*. That process is overfitting by definition: you’re optimising to the past, noise included. Each added parameter should earn its place with a real reason and survive out-of-sample, not just prettify the historical curve.
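One way to see why "tweak until it looks great" selects noise is to try many variants on a market that is pure noise and keep the best one. The sketch below is a rough illustration under loud assumptions: daily returns are random draws with zero mean, and each "tweak" is modelled as a fresh random long/short pattern, not a real rule. `mean_pnl` and the sizes used are hypothetical.

```python
import random

rng = random.Random(42)                                 # fixed seed for repeatability
returns = [rng.gauss(0.0, 0.01) for _ in range(1000)]   # a "market" with no signal
split = 500                                             # first half tuned on, second half unseen

def mean_pnl(positions, rets):
    """Average daily P&L from holding +1 (long) or -1 (short) each day."""
    return sum(p * r for p, r in zip(positions, rets)) / len(rets)

# Try 200 variants; each one stands in for another round of tweaking.
scores = []
for _ in range(200):
    positions = [rng.choice((-1, 1)) for _ in returns]
    in_score = mean_pnl(positions[:split], returns[:split])
    out_score = mean_pnl(positions[split:], returns[split:])
    scores.append((in_score, out_score))

# Keep the variant with the prettiest in-sample result, as the tweaker would.
best_in, best_out = max(scores, key=lambda s: s[0])
average_in = sum(s[0] for s in scores) / len(scores)

print(f"average variant, in-sample: {average_in:+.5f} per day")
print(f"best variant, in-sample:    {best_in:+.5f} per day")
print(f"same variant, unseen data:  {best_out:+.5f} per day")
```

The winner's in-sample score reflects how many variants were tried, not any edge. On the unseen half there is no reason to expect it to do better than a coin flip.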
Key takeaway: Overfitting fits the noise in one history, memorising the past instead of learning a real pattern, so it’s perfect in-sample and useless live. The more impressive (over-parameterised) the backtest, often the more overfitted. Defend with simplicity, out-of-sample testing, and a named economic reason it works.
FAQs
How many parameters is “too many”?

There’s no hard number, but fewer is almost always safer: each parameter is another chance to fit noise. A robust strategy usually has a *handful* of rules with a clear rationale and works across a *range* of settings (parameter insensitivity, covered in a later lesson). If your edge depends on precise “magic numbers” and many conditions, suspect overfitting.
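A rough way to test for “magic numbers” is to compare the best setting’s score with the scores of its neighbours. The sketch below is one possible check, not a standard metric. It assumes you already have a backtest score for each parameter value, that the peak score is positive, and that at least two settings were tested. The function name `neighbourhood_ratio` and both score tables are invented for illustration.

```python
def neighbourhood_ratio(scores, width=2):
    """Average score of the settings next to the best one, divided by the best score.

    Near 1.0 suggests a plateau (nearby settings work too);
    near 0.0 suggests an isolated spike.
    """
    params = sorted(scores)
    best_idx = max(range(len(params)), key=lambda i: scores[params[i]])
    lo = max(0, best_idx - width)
    hi = min(len(params), best_idx + width + 1)
    neighbours = [scores[params[i]] for i in range(lo, hi) if i != best_idx]
    return (sum(neighbours) / len(neighbours)) / scores[params[best_idx]]

# Invented scores keyed by a lookback length, for illustration only.
spike = {10: 0.02, 12: 0.01, 14: 0.30, 16: 0.00, 18: 0.03}
plateau = {10: 0.10, 12: 0.12, 14: 0.13, 16: 0.12, 18: 0.11}

print(f"spike ratio:   {neighbourhood_ratio(spike):.2f}")
print(f"plateau ratio: {neighbourhood_ratio(plateau):.2f}")
```

A strategy whose result collapses when a parameter moves one step away from its tuned value is depending on that exact number, which is the pattern to be suspicious of.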