Features: Garbage In, Garbage Out
The model is only as good as what you feed it. Building features that carry real signal.
In ML, “features” are the input variables you feed the model (momentum, valuation ratios, volatility, etc.). Feature engineering, meaning choosing and constructing good features, is where most of the real value is created, far more than in the choice of a fancy algorithm.
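To make "constructing features" concrete, here is a minimal sketch that turns a price series into a momentum and a volatility feature with pandas. The prices are synthetic, and the function name `build_features` and the window lengths (60 and 20 periods) are illustrative assumptions, not values taken from this course. Valuation ratios are omitted because they would need fundamentals data.

```python
import numpy as np
import pandas as pd

def build_features(prices: pd.Series, mom_window: int = 60, vol_window: int = 20) -> pd.DataFrame:
    """Build two example features from a price series (window lengths are assumptions)."""
    returns = prices / prices.shift(1) - 1.0  # simple one-period returns
    feats = pd.DataFrame({
        # Momentum: total return over the trailing mom_window periods.
        "momentum": prices / prices.shift(mom_window) - 1.0,
        # Volatility: standard deviation of returns over the trailing vol_window periods.
        "volatility": returns.rolling(vol_window).std(),
    })
    # Drop the warm-up rows where a window is not yet full.
    return feats.dropna()

# Synthetic random-walk prices, for illustration only.
rng = np.random.default_rng(0)
prices = pd.Series(
    100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 300))),
    index=pd.bdate_range("2020-01-01", periods=300),
    name="price",
)
features = build_features(prices)
print(features.tail())
```

Both columns look only backwards from each row, so the size of a price swing (volatility) and its recent direction (momentum) are captured as separate inputs.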
- Features matter most: value lives in what you feed the model, far more than in the choice of algorithm.
- Encode real signal: good features capture economic or behavioural drivers (factors make great features).
- Point-in-time discipline: features must use only data that was available at the time (no look-ahead leakage; see Module 3).
- Less is more: a few meaningful, leak-free features beat hundreds of noisy or redundant ones, and they reduce overfitting.
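The point-in-time rule above can be checked mechanically. The sketch below is one possible test, not a standard library routine: it alters the data from a cutoff onward and checks whether any feature value before the cutoff changes. The helper names `trailing_mean`, `centered_mean`, and `uses_future_data` are hypothetical, and the returns are synthetic.

```python
import numpy as np
import pandas as pd

def trailing_mean(returns: pd.Series, window: int = 5) -> pd.Series:
    # Trailing window, then shift(1): the value in row t uses returns through t-1 only,
    # which matters when row t's own return is the prediction target.
    return returns.rolling(window).mean().shift(1)

def centered_mean(returns: pd.Series, window: int = 5) -> pd.Series:
    # Leaky on purpose: a centered window includes rows after t.
    return returns.rolling(window, center=True).mean()

def uses_future_data(feature_fn, returns: pd.Series, cutoff: int) -> bool:
    """Return True if feature values before `cutoff` change when later data is altered."""
    altered = returns.copy()
    altered.iloc[cutoff:] = altered.iloc[cutoff:] + 1.0
    before = feature_fn(returns).iloc[:cutoff]
    after = feature_fn(altered).iloc[:cutoff]
    # Series.equals treats NaNs in matching positions as equal.
    return not before.equals(after)

rng = np.random.default_rng(1)
returns = pd.Series(rng.normal(0.0, 0.01, 100))

print("trailing_mean leaks:", uses_future_data(trailing_mean, returns, cutoff=50))
print("centered_mean leaks:", uses_future_data(centered_mean, returns, cutoff=50))
```

A check like this only catches leakage that comes from the feature's own inputs. It cannot detect data that was revised or published late, which still needs point-in-time data sources.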
Should I throw hundreds of features at the model and let it sort them out?
No. That invites overfitting and noise-fitting, especially in markets. Prefer a *small set* of meaningful, economically grounded, leak-free features. More features mean more ways to fit noise and more chances of hidden look-ahead. Disciplined feature selection beats brute-force feature dumping, which is a classic route to an ML model that backtests beautifully and then fails live.
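A small simulation illustrates the effect. This is a sketch on synthetic data, assuming a plain least-squares model, three weakly informative features, and 200 pure-noise features. All sizes and coefficients are made up for illustration. The in-sample fit can only improve when columns are added, while the out-of-sample fit typically gets much worse.

```python
import numpy as np

def r_squared(y: np.ndarray, y_hat: np.ndarray) -> float:
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_and_score(X_train, y_train, X_test, y_test):
    """Fit ordinary least squares on the training rows; return (in-sample, out-of-sample) R^2."""
    coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return r_squared(y_train, X_train @ coef), r_squared(y_test, X_test @ coef)

rng = np.random.default_rng(42)
n_train, n_test, n_signal, n_noise = 250, 250, 3, 200
n = n_train + n_test

signal = rng.normal(size=(n, n_signal))      # features that truly drive the target
noise_feats = rng.normal(size=(n, n_noise))  # features unrelated to the target
y = signal @ np.full(n_signal, 0.2) + rng.normal(size=n)  # weak signal plus noise

X_small = signal
X_big = np.hstack([signal, noise_feats])
train, test = slice(0, n_train), slice(n_train, n)

small_in, small_out = fit_and_score(X_small[train], y[train], X_small[test], y[test])
big_in, big_out = fit_and_score(X_big[train], y[train], X_big[test], y[test])

print(f"3 features:   in-sample R^2 = {small_in:.3f}, out-of-sample R^2 = {small_out:.3f}")
print(f"203 features: in-sample R^2 = {big_in:.3f}, out-of-sample R^2 = {big_out:.3f}")
```

The wide model's in-sample score plays the role of the great backtest, and its out-of-sample score plays the role of live trading.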