
Why your backtest passes —
and your live account doesn't

Out-of-sample testing isn't a technicality. It's the only honest way to know if a strategy actually has an edge — and most retail traders skip it entirely.

[Figure: strategy equity curve showing in-sample vs out-of-sample performance, 2005–2025]
The gap between how a strategy performs in-sample vs out-of-sample tells you more than the backtest numbers ever will.

The backtest trap

Every strategy looks good in backtests. That's not a compliment.

A backtest that uses all available data for both development and evaluation is guaranteed to "work" — because you've essentially solved the same test you're grading yourself on. The strategy isn't predicting anything. It's describing what already happened.

The problem isn't the backtest tool. It's the process. Most traders develop a strategy, run it on five years of data, adjust the parameters until the equity curve looks the way they want, and call it done. What they've actually built is a model that describes the past. Not one that predicts the future.

If every strategy you build passes your process, your process isn't filtering anything.

What out-of-sample testing actually means

Before I start developing any strategy, I lock a portion of the price history away. No parameters are tuned against it. No entries, exits, or filters are tested on it. That data doesn't exist until development is finished.

Once the strategy is built and validated on the in-sample window, I run it once — just once — on the out-of-sample data. Whatever the result is, that's the result. No re-optimisation. No "let me just adjust that one parameter." If it fails OOS, the strategy is rejected. That's the whole point.

The OOS window is typically the most recent 20–30% of the available history. This matters: the strategy must ultimately be judged on data it never saw during development. Every bar you include in the development window is a bar the optimiser can quietly fit itself to.
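In code, the lock-away step is nothing more than a chronological split taken before development begins. A minimal sketch, using synthetic prices and a hypothetical `split_in_oos` helper (the names and the 25% fraction are illustrative, not EdgeLab's actual tooling):

```python
import numpy as np

def split_in_oos(prices: np.ndarray, oos_fraction: float = 0.25):
    """Split a price series chronologically: the most recent
    `oos_fraction` of bars is locked away as out-of-sample data."""
    cut = int(len(prices) * (1 - oos_fraction))
    return prices[:cut], prices[cut:]

# Synthetic daily closes standing in for ~20 years of history.
rng = np.random.default_rng(0)
prices = 100 * np.cumprod(1 + rng.normal(0.0003, 0.01, 5000))

in_sample, oos = split_in_oos(prices, oos_fraction=0.25)
# Develop and tune only on `in_sample`; touch `oos` exactly once.
```

The key discipline is that the split happens first: every optimisation, filter, and parameter sweep sees only `in_sample`.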

Why a high rejection rate is a feature

I reject more than 80% of the strategies I develop. That number isn't something I'm embarrassed about — it's the mechanism.

If you're not rejecting most of what you build, you're not testing. You're curating. The goal is to fail most strategies quickly, cheaply, and before you trade real money on them — not to find a way to make every strategy look good enough to publish.

The strategies that survive have passed a filter specifically designed to catch overfit curves. The ones that fail are exactly what they look like: backtests that worked on the data they were tuned on, and nothing else.

The three tests every EdgeLab strategy passes

Out-of-sample performance is the first filter — but not the only one. After OOS, every strategy goes through two more validation steps: sensitivity analysis and Monte Carlo simulation.
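One common form of the Monte Carlo step resamples the realised trade returns many times and looks at the distribution of worst drawdowns, rather than trusting the single drawdown the backtest happened to produce. The resampling approach, run count, and helper name below are illustrative assumptions, not the post's actual procedure:

```python
import numpy as np

def monte_carlo_drawdowns(trade_returns, n_runs=1000, seed=0):
    """Resample trade returns with replacement many times and record
    the worst drawdown of each resampled equity curve."""
    rng = np.random.default_rng(seed)
    r = np.asarray(trade_returns)
    worst = np.empty(n_runs)
    for i in range(n_runs):
        sample = rng.choice(r, size=len(r), replace=True)
        equity = np.cumprod(1 + sample)          # compounded equity curve
        peak = np.maximum.accumulate(equity)     # running high-water mark
        worst[i] = ((equity - peak) / peak).min()
    return worst

# 200 hypothetical trade returns from a modestly profitable strategy.
rng = np.random.default_rng(1)
trades = rng.normal(0.002, 0.02, 200)
dd = monte_carlo_drawdowns(trades)
```

A wide spread in `dd` tells you the backtest's drawdown was partly luck of the draw, which is exactly the kind of fragility this filter is meant to expose.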

What to do if you're building strategies now

If you're running strategies without a locked OOS window, you're not testing. You're fitting. The fix isn't more data or a better backtest tool — it's committing to the process before you start, not after the equity curve already looks good.

Set aside the last 20% of your available history before you open the backtest. Don't look at it. Don't reference it. Develop your strategy entirely on the in-sample data, then run a single forward test on the OOS window when you're done. Whatever comes out, that's your real result.
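The single forward test can be made mechanical, so there's no temptation to negotiate with the result. The sketch below compares in-sample and OOS Sharpe ratios and rejects the strategy if the OOS figure keeps less than half of the in-sample one; the `passes_oos` helper and the 0.5 threshold are illustrative assumptions, not the post's rule:

```python
import numpy as np

def sharpe(returns, periods_per_year=252):
    """Annualised Sharpe ratio of a series of per-period returns."""
    r = np.asarray(returns)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)

def passes_oos(is_returns, oos_returns, max_degradation=0.5):
    """Accept only if the OOS Sharpe retains at least `max_degradation`
    of the in-sample Sharpe (hypothetical threshold for illustration)."""
    s_is, s_oos = sharpe(is_returns), sharpe(oos_returns)
    if s_is <= 0:
        return False  # no in-sample edge to begin with
    return s_oos >= max_degradation * s_is

# Synthetic daily return streams standing in for backtest output.
rng = np.random.default_rng(0)
is_returns = rng.normal(0.003, 0.01, 500)   # tuned in-sample result
good_oos = rng.normal(0.005, 0.01, 200)     # edge persists OOS
bad_oos = rng.normal(-0.002, 0.01, 200)     # overfit curve falls apart
```

Committing to a rule like this before you open the OOS window is what makes the test binding: the strategy either clears the bar or it doesn't.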

If that process kills most of what you build — it's working.

Robin Eriksson
Founder, EdgeLab

Get a free tested strategy

Every strategy I publish has passed OOS testing, sensitivity analysis, and Monte Carlo simulation. The first one is free.

Get my free strategy