Cross-Validation for Strategies
Borrowing a machine-learning trick to test a strategy on many different slices of history.
Cross-validation (testing a model on data it wasn’t trained on) is a machine-learning technique adapted for strategy testing: instead of one in/out-of-sample split, you test across many different slices of history, so your verdict doesn’t hinge on one arbitrary choice of test period.
- The idea — test across many slices of history, not one split, and aggregate for a stable estimate of the edge (a repeatable, structural reason your trades win over time).
- Why — one test window can be luckily kind or cruel; many windows average out that luck.
- The market caveat — never shuffle time-series data freely (it leaks the future); use time-ordered CV (train only on the past).
- Refinements — purging/embargo gaps between train and test prevent subtle leakage across adjacent periods.
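The points above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a production backtester: `walk_forward_splits`, `score_fold`, the `gap` parameter, and the synthetic `returns` series are all hypothetical names and data introduced here. The splitter uses an expanding training window and drops the last `gap` observations before each test block as a simple stand-in for a purge/embargo gap.

```python
from statistics import mean, stdev


def walk_forward_splits(n_samples, n_folds, gap=0):
    """Yield (train, test) index lists in chronological order.

    Expanding window: each test block is a later slice of history, and the
    training set is everything before it minus the last `gap` observations.
    """
    if n_folds < 1 or gap < 0:
        raise ValueError("n_folds must be >= 1 and gap must be >= 0")
    block = n_samples // (n_folds + 1)  # the first block is used for training only
    if block <= gap:
        raise ValueError("not enough data for this many folds and this gap")
    for k in range(1, n_folds + 1):
        test_start = k * block
        # The last fold absorbs any leftover observations.
        test_end = n_samples if k == n_folds else test_start + block
        train = list(range(0, test_start - gap))
        test = list(range(test_start, test_end))
        yield train, test


def score_fold(returns, train, test):
    """Toy rule (hypothetical): trade in the direction of the training-set mean return."""
    position = 1 if mean(returns[i] for i in train) > 0 else -1
    return position * mean(returns[i] for i in test)


# Synthetic, deterministic "daily returns", for illustration only.
returns = [0.001 * ((i * 7) % 11 - 4) for i in range(100)]

scores = [
    score_fold(returns, train, test)
    for train, test in walk_forward_splits(len(returns), n_folds=4, gap=5)
]
print("per-fold scores:", scores)
print("mean:", mean(scores), "stdev:", stdev(scores))
```

Reporting the spread alongside the mean matters here: a strategy whose per-fold scores swing widely has a less trustworthy average than one that is modestly positive in every slice.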
Why can’t I use standard k-fold cross-validation on market data?
Standard k-fold holds out each fold in turn and trains on all the others, often after shuffling the rows. In a time series, that means training on *future* data to predict the *past*: a fatal look-ahead leak that produces fake-good results. Markets require time-aware variants (walk-forward, purged/embargoed cross-validation) that strictly preserve chronological order, so the model is only ever validated on data that came *after* what it learned from.
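To make the leak concrete, the sketch below counts, for each fold, how many training observations fall later in time than the earliest test observation. The helpers `kfold_splits` and `future_leaks` are hypothetical, written in plain Python for illustration rather than taken from any library, and observation indices stand in for timestamps.

```python
import random


def kfold_splits(n_samples, n_folds, shuffle=False, seed=0):
    """Plain k-fold: each fold is the test set once, all other rows are training."""
    idx = list(range(n_samples))  # index i stands in for the i-th point in time
    if shuffle:
        random.Random(seed).shuffle(idx)
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        start = k * fold_size
        stop = n_samples if k == n_folds - 1 else start + fold_size
        yield idx[:start] + idx[stop:], idx[start:stop]


def future_leaks(train, test):
    """Number of training points that occur after the earliest test point."""
    first_test = min(test)
    return sum(1 for i in train if i > first_test)


plain = [future_leaks(tr, te) for tr, te in kfold_splits(100, 5)]
shuffled = [future_leaks(tr, te) for tr, te in kfold_splits(100, 5, shuffle=True)]

# Time-ordered alternative: train on everything before each test block.
ordered = [
    future_leaks(list(range(s)), list(range(s, s + 20)))
    for s in (20, 40, 60, 80)
]

print("plain k-fold:   ", plain)     # [80, 60, 40, 20, 0]
print("shuffled k-fold:", shuffled)
print("time-ordered:   ", ordered)   # [0, 0, 0, 0]
```

Even without shuffling, every k-fold split except the last trains on data from after the test block. Only the time-ordered splits have zero future observations in training.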