Data Quality & Adjustments
Splits, dividends, bonus issues and bad ticks — why dirty data quietly ruins backtests.
A backtest (a test of a trading strategy on historical data) is only as trustworthy as the data underneath it, and raw market data is full of traps that silently corrupt results. “Garbage in, garbage out” applies brutally here, because dirty data often produces plausible-looking (but fake) results rather than obvious errors.
- Corporate actions (company events that affect its shares): splits, bonuses and consolidations mechanically change the quoted price without changing the value of a holding; use adjusted data or your test sees phantom crashes and spikes.
- Dividends (cash payouts of company profits to shareholders): total-return and price-only data give different results; be consistent and explicit about which you use.
- Bad ticks & gaps: erroneous prints and missing bars can trigger fake signals; clean and sanity-check the data.
- Survivorship: datasets that exclude delisted/dead companies inflate results (covered in the bias module); include the graveyard.
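To make the corporate-action point concrete, here is a minimal sketch of split back-adjustment in plain Python. The function names (`back_adjust`, `simple_returns`), the sample prices and the 2-for-1 split are all made up for illustration; real vendors may use different conventions, and this sketch ignores dividends entirely.

```python
from datetime import date


def back_adjust(bars, splits):
    """Return split-adjusted closes (hypothetical helper, not a vendor API).

    bars:   list of (date, close), sorted oldest first
    splits: dict mapping ex-date -> ratio (new shares per old share,
            e.g. 2.0 for a 2-for-1 split)
    """
    adjusted = []
    factor = 1.0
    # Walk newest to oldest so each bar is divided by every split after it.
    for day, close in reversed(bars):
        adjusted.append((day, close / factor))
        # A split effective on `day` rescales all earlier bars.
        factor *= splits.get(day, 1.0)
    adjusted.reverse()
    return adjusted


def simple_returns(bars):
    """Bar-to-bar simple returns from a list of (date, close)."""
    return [b[1] / a[1] - 1.0 for a, b in zip(bars, bars[1:])]


# Illustrative data: the price halves on the split date, nothing else happens.
raw = [
    (date(2024, 1, 2), 100.0),
    (date(2024, 1, 3), 102.0),
    (date(2024, 1, 4), 51.0),  # 2-for-1 split takes effect on this bar
    (date(2024, 1, 5), 52.0),
]
splits = {date(2024, 1, 4): 2.0}

adj = back_adjust(raw, splits)
raw_returns = simple_returns(raw)  # contains a phantom -50% "crash"
adj_returns = simple_returns(adj)  # the same bar now shows a flat day
```

On the raw series a strategy would see a 50% one-day loss on the split date; on the adjusted series that bar is flat, which matches what a shareholder actually experienced.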
Where do data problems most often hide?
In corporate-action adjustments (phantom crashes from splits/bonuses), in survivorship-biased datasets (silently missing dead companies), and in subtle point-in-time errors (using restated fundamentals that were not yet available on the trade date). These rarely throw obvious errors. They produce *believable* numbers, which is exactly why they’re dangerous. Always verify that your data source handles adjustments and includes delisted names.
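One rough way to start verifying a price series is to scan it for large one-bar moves and look at what happens next. The sketch below is an assumption-laden heuristic, not a standard method: the function name `flag_suspect_bars`, the 30% threshold and the labels are invented here, prices are assumed to be positive, and genuine market moves can trip it too, so every flag needs a manual check against a corporate-action record.

```python
def flag_suspect_bars(closes, threshold=0.30):
    """Flag large one-bar moves and guess a likely cause (heuristic only).

    Returns a list of (index, label), where label is one of:
      "bad_tick"    - price jumps, then snaps back on the next bar
      "level_shift" - price jumps and stays there
                      (possibly an unadjusted split or bonus)
      "unconfirmed" - jump on the last bar, so no follow-up bar to compare
    """
    flags = []
    i = 1
    while i < len(closes):
        prev, cur = closes[i - 1], closes[i]
        move = cur / prev - 1.0
        if abs(move) >= threshold:
            if i + 1 >= len(closes):
                flags.append((i, "unconfirmed"))
            elif abs(closes[i + 1] / prev - 1.0) < threshold:
                # The next bar is back near the pre-jump level.
                flags.append((i, "bad_tick"))
                i += 2  # skip the snap-back bar so it is not flagged again
                continue
            else:
                flags.append((i, "level_shift"))
        i += 1
    return flags


# Illustrative series: one erroneous print, then a halving that persists.
closes = [100.0, 101.0, 10.1, 102.0, 103.0, 51.5, 52.0, 52.5]
flags = flag_suspect_bars(closes)
```

Here the print at index 2 is labelled `bad_tick` because the price returns to its earlier level on the next bar, while the halving at index 5 is labelled `level_shift`, the pattern an unadjusted 2-for-1 split would leave behind.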