All you want to know about back testing

The quality of back testing is critical to the success of a software trading system.

Back testing means simulating trades on historical data to estimate the likely profitability and risk of the system.
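As a rough illustration of what simulating trades on historical data involves, the sketch below replays a deliberately simple rule over a list of closing prices. The rule (go long when the price is above its trailing average, exit when it falls below), the function names `backtest` and `max_drawdown`, and the `lookback` parameter are all assumptions made for this example and do not describe any particular trading system. Trades are filled at the closing price and costs are ignored.

```python
def backtest(prices, lookback=3):
    """Replay a hypothetical long-only rule and return the return of each closed trade."""
    trade_returns = []
    entry = None  # price at which the current position was opened, if any
    for i in range(lookback, len(prices)):
        # Average of the previous `lookback` prices (the current bar is excluded).
        trailing_mean = sum(prices[i - lookback:i]) / lookback
        price = prices[i]
        if entry is None and price > trailing_mean:
            entry = price  # open a long position
        elif entry is not None and price < trailing_mean:
            trade_returns.append(price / entry - 1.0)  # close it and record the return
            entry = None
    # A position still open at the end of the data is not counted.
    return trade_returns


def max_drawdown(trade_returns):
    """Largest peak-to-trough fall in compounded equity, as a fraction (a simple risk measure)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst
```

The list of trade returns speaks to profitability, while the drawdown figure gives one possible view of risk.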

The simulation has to be done in a careful and statistically valid manner. Poor-quality back testing tends to produce a system that performs well in a simulation but fails miserably during live trading.

Broadly speaking, back testing is done in phases.

Work with the 'in sample' period
Data from a particular period is selected to develop and test the system. This is technically referred to as the 'in sample' period. During this phase of testing, various rules are fed into the system and refined to arrive at the best-performing version. The results are used only to develop the system, not to check its validity.
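The refinement step can be pictured as a search over candidate rule settings using only the 'in sample' data. The sketch below is a minimal example under stated assumptions: the rule itself (hold for one bar whenever the price has risen over the last `lookback` bars) and the names `rule_profit` and `optimize_in_sample` are hypothetical and chosen only to show the idea.

```python
def rule_profit(prices, lookback):
    """Profit of a hypothetical rule: hold for one bar whenever price rose over `lookback` bars."""
    profit = 0.0
    for i in range(lookback, len(prices) - 1):
        if prices[i] > prices[i - lookback]:
            profit += prices[i + 1] - prices[i]  # gain or loss over the next bar
    return profit


def optimize_in_sample(in_sample_prices, candidate_lookbacks):
    """Return the candidate lookback with the highest profit on the in-sample data."""
    return max(candidate_lookbacks, key=lambda lb: rule_profit(in_sample_prices, lb))
```

The profit of the winning setting is not evidence that the system works, because that setting was picked precisely for doing well on this data.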

Test in other markets
Once the desired level of profitability has been reached in the 'in sample' period, it is a good idea to simulate trading in other markets as well. For example, a system being designed for the Indian stock market should at this stage also be tested in, say, other Asian markets.
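One way to organise such a cross-market check is to apply the same rule settings, unchanged, to each market's price history and see in how many of them the system stays profitable. The helper names `evaluate_across_markets` and `share_profitable`, and the `score` callable supplied by the caller, are assumptions for this sketch.

```python
def evaluate_across_markets(market_prices, score, params):
    """Apply one unchanged parameter set to every market and collect the scores.

    `market_prices` maps a market name to its price list; `score(prices, params)`
    is a caller-supplied function that returns the simulated profit.
    """
    return {name: score(prices, params) for name, prices in market_prices.items()}


def share_profitable(scores):
    """Fraction of markets in which the simulated profit is positive."""
    return sum(1 for s in scores.values() if s > 0) / len(scores)
```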

Validate the system
Once the system performs well in other markets too, it is time to validate it. To do this, the simulation should be run on an 'out of sample' period that is distinct from the 'in sample' period. In other words, it is a time period whose data was not used while developing the system. In this period the trading rules cannot be varied, otherwise the test is invalidated. If the system developer varies the rules, real-life returns will probably not match the test results.
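The separation between the two periods can be made explicit in code. In the sketch below the data is split chronologically, the rule setting is chosen on the earlier part only, and the later part is scored exactly once with that setting frozen. The names `chronological_split` and `validate`, the 70/30 default split, and the caller-supplied `score` function are assumptions for illustration.

```python
def chronological_split(series, in_sample_fraction=0.7):
    """Split a time-ordered series into an earlier in-sample part and a later out-of-sample part."""
    cut = int(len(series) * in_sample_fraction)
    return series[:cut], series[cut:]


def validate(series, candidates, score, in_sample_fraction=0.7):
    """Tune on the in-sample data only, then score the out-of-sample data once.

    `score(data, param)` is a caller-supplied function returning simulated profit.
    Returns the chosen parameter and its out-of-sample score.
    """
    in_sample, out_of_sample = chronological_split(series, in_sample_fraction)
    best = max(candidates, key=lambda p: score(in_sample, p))  # all tuning happens here
    return best, score(out_of_sample, best)  # rules are frozen for this single run
```

Going back to adjust the candidates after seeing the out-of-sample score would turn that period into a second 'in sample' period.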

The final test
After the system shows profits in the 'out of sample' period, one needs to ensure that the results were not the outcome of pure chance. One way of doing this is by using the T test.
A high T test score results from:
1. A large number of trades in the test period. The more trades there are, the greater the confidence level.
2. High profitability in the 'out of sample' period. The higher the profit, the higher the probability that it's a true system and not a stroke of luck.
3. Profits that are well distributed amongst all the trades. If, for instance, all the profits come from one big trade, then the level of confidence indicated is low.

A system is deemed to be valid for use in live trading only if the T test indicates a high level of confidence, say 90%.
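For a one-sample T test on trade returns, the score is the average return per trade divided by its standard error, that is, the standard deviation of the returns divided by the square root of the number of trades. The sketch below computes that score from a list of 'out of sample' trade returns. Two things here are assumptions rather than part of the method described above: the function names, and the use of a normal approximation to turn the score into a confidence level. The approximation is reasonable only when the number of trades is fairly large; with few trades it overstates confidence, and a Student's t table should be consulted instead.

```python
import math
from statistics import NormalDist, mean, stdev


def t_statistic(trade_returns):
    """One-sample t-statistic for the hypothesis that the mean trade return is zero."""
    n = len(trade_returns)
    if n < 2:
        raise ValueError("need at least two trades")
    spread = stdev(trade_returns)  # sample standard deviation
    if spread == 0:
        raise ValueError("all trade returns are identical; the statistic is undefined")
    return mean(trade_returns) / (spread / math.sqrt(n))


def approx_confidence(t):
    """One-sided confidence that the true mean return is positive (normal approximation)."""
    return NormalDist().cdf(t)


def passes(trade_returns, required_confidence=0.90):
    """True if the approximate confidence reaches the required level."""
    return approx_confidence(t_statistic(trade_returns)) >= required_confidence
```

The same total profit can give very different scores: forty trades that each earn a little produce a high score, while thirty-nine flat trades and one large winner produce a score of about 1, which falls short of 90% confidence.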