WealthJot.ai

In-Sample vs Out-of-Sample

Intermediate · 7 min read

Build on one slice of history, test on a slice you never saw. The discipline that catches self-deception.

The single most important discipline in backtesting (testing a trading strategy on historical data) is splitting your history into two parts: *in-sample* data, which you use to build and tune the strategy, and *out-of-sample* data, which you set aside and test on only *after* the strategy is finalised, so the strategy has never seen it.

Out-of-sample testing (testing a strategy on data it was never built on) is your defence against fooling yourself: it’s the trading equivalent of grading a student on questions they didn’t see while studying. If you build and judge a strategy on the same data, of course it looks great: you (knowingly or not) shaped it to fit that exact history. The honest question is never “does it fit the past I built it on?” but “does it work on data it has never seen?” A strategy that shines in-sample but falls apart out-of-sample was overfitted: it memorised noise, not signal. This split is what separates a discovery from a delusion. The iron rule: touch the out-of-sample data only once, at the very end. Every time you peek and re-tweak, you contaminate it, turning your “test” back into more fitting.
Example: You build a strategy on 2010–2018 data (in-sample) and it returns 22% CAGR (Compound Annual Growth Rate, the smoothed yearly return). The real test: run it untouched on 2019–2023 (out-of-sample). If it still returns ~18%, you likely have a genuine edge (a repeatable, structural reason your trades win over time). If it collapses to −5%, you overfitted 2010–2018’s noise. Only the data you didn’t build on could tell you.
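
The example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the annual returns are made-up numbers chosen to mirror the 22% and 18% figures in the example, and the helper names `split_by_year` and `cagr_from_annual_returns` are hypothetical, not part of any library.

```python
from math import prod


def split_by_year(annual_returns, last_in_sample_year):
    """Split {year: return} into in-sample (build/tune) and out-of-sample (test only)."""
    in_sample = {y: r for y, r in annual_returns.items() if y <= last_in_sample_year}
    out_of_sample = {y: r for y, r in annual_returns.items() if y > last_in_sample_year}
    return in_sample, out_of_sample


def cagr_from_annual_returns(returns):
    """Compound the yearly returns, then annualise: (product of (1 + r)) ** (1 / n) - 1."""
    returns = list(returns)
    if not returns:
        raise ValueError("need at least one annual return")
    growth = prod(1.0 + r for r in returns)
    return growth ** (1.0 / len(returns)) - 1.0


# Illustrative, invented strategy returns (assumption, not real data):
# 22% a year over 2010-2018, then 18% a year over 2019-2023.
annual_returns = {year: 0.22 for year in range(2010, 2019)}
annual_returns.update({year: 0.18 for year in range(2019, 2024)})

in_sample, out_of_sample = split_by_year(annual_returns, last_in_sample_year=2018)

is_cagr = cagr_from_annual_returns(in_sample.values())
oos_cagr = cagr_from_annual_returns(out_of_sample.values())

print(f"In-sample CAGR (2010-2018):     {is_cagr:.1%}")
print(f"Out-of-sample CAGR (2019-2023): {oos_cagr:.1%}")
```

The split is by calendar date rather than at random, so the out-of-sample slice always lies after the period the strategy was built on. That matches how the strategy would have been traded in practice.
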
Key takeaway: Split history into in-sample (build/tune) and out-of-sample (test only, never seen). A real edge survives on data it never saw; one that shines in-sample but dies out-of-sample was overfitted. Touch the out-of-sample set once, at the end: peeking and re-tweaking destroys its value.
FAQs
What if my strategy fails out-of-sample — can I just adjust it?

No. If you re-tune based on out-of-sample results, that data is now *in-sample* (you’ve fitted to it), and you no longer have an honest test. The disciplined response is to go back to the drawing board with a *new* hypothesis and reserve a *fresh* untouched slice, or to use walk-forward analysis (covered in a later module), which formalises repeated honest testing. Don’t quietly fit to your “test” set.
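
One way to make the "touch it once" rule hard to break by accident is to enforce it in code. The sketch below is a minimal illustration, not a standard tool: the class name `OneShotHoldout`, its `evaluate` method, and the sample returns are all hypothetical.

```python
class OneShotHoldout:
    """Wraps out-of-sample data so it can be scored exactly once."""

    def __init__(self, data):
        self._data = list(data)
        self._used = False

    def evaluate(self, metric):
        """Apply `metric` to the held-out data; refuse any second attempt."""
        if self._used:
            raise RuntimeError(
                "Holdout already used: reserve a fresh slice for a new hypothesis."
            )
        self._used = True
        return metric(self._data)


def mean_return(returns):
    """Simple average of a list of returns."""
    return sum(returns) / len(returns)


# Invented out-of-sample annual returns (assumption, for illustration only).
holdout = OneShotHoldout([0.18, 0.18, 0.18, 0.18, 0.18])

first_result = holdout.evaluate(mean_return)  # the one honest test

try:
    holdout.evaluate(mean_return)  # peeking again after a re-tweak
    second_attempt_blocked = False
except RuntimeError:
    second_attempt_blocked = True
```

A guard like this cannot stop someone from reloading the raw data, so it is a reminder rather than a guarantee. It does make a second look at the test set a deliberate act instead of a quiet habit.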