Option Fanatic: Options, stock, futures, and system trading, backtesting, money management, and much more!

An Argument for Statistics (Part 3)

I left off with a general description of the statistical hypothesis testing process.

Once the assumptions behind a sample or experimental design are identified, the next step is to choose an appropriate statistical test to run on the data.

Select a level of significance (Greek letter alpha: α), which is a probability threshold below which the null hypothesis (H0) will be rejected. Common values are 0.05 and 0.01. The null hypothesis represents the status quo and is presumed true unless the data say otherwise. In effect, α states “if a sample like this would turn up less than one time in 20 (or 100, respectively) under the status quo, then I believe it is not status quo (i.e. reject H0) but rather something different (i.e. accept HA).”

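Perform the statistical test, which will output a p-value. The p-value is the probability of seeing a difference at least as large as the one observed if the groups really are from the same population (i.e. no difference, or H0 is true). It is not the probability that H0 itself is true. If the p-value < α then reject H0 and accept HA.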
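To make the "run a test, get a p-value, compare to α" step concrete, here is a minimal sketch using a permutation test, which is only one of many tests that could be chosen. The trade PnL numbers and the function name permutation_p_value are hypothetical and made up for illustration; nothing here comes from real backtest data.

```python
import random

def _mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in means.

    Under H0 the group labels are interchangeable, so we shuffle the pooled
    data and count how often a relabeled difference is at least as large as
    the observed one.
    """
    rng = random.Random(seed)
    observed = abs(_mean(group_a) - _mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:n_a]) - _mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)

# Hypothetical per-trade PnL (in dollars) for two strategy variants.
group_a = [42, -15, 30, 8, 55, -20, 12, 27, -5, 16]
group_b = [10, -30, 22, -8, 35, -25, 5, 14, -12, 9]

alpha = 0.05
p_value = permutation_p_value(group_a, group_b)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f}, alpha = {alpha}: {decision}")
```

The decision rule on the last lines is the whole procedure: α is fixed before looking at the data and the p-value is compared against it afterward.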

Hypothesis testing is not perfect. A type I error occurs by rejecting H0 when H0 is in fact true. This is also known as a “false positive” and, when H0 is true, the probability of making this mistake is equal to α. On the flip side, a type II error occurs by not rejecting H0 when H0 is in fact false. This is a “false negative,” and its probability is usually denoted β.
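A quick simulation can illustrate both error types. The sketch below assumes normally distributed data with a known standard deviation and uses a simple two-sample z-test; the function names, the sample size of 30, and the 0.5 shift are all arbitrary choices for illustration rather than anything from a real trading system.

```python
import random
from math import sqrt
from statistics import NormalDist

def z_test_p_value(a, b, sigma):
    """Two-sided two-sample z-test, assuming a known common standard deviation."""
    diff = sum(a) / len(a) - sum(b) / len(b)
    z = diff / (sigma * sqrt(1 / len(a) + 1 / len(b)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_shift, alpha=0.05, n=30, sigma=1.0, n_sims=4000, seed=1):
    """Fraction of simulated experiments in which H0 (no difference) is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        a = [rng.gauss(true_shift, sigma) for _ in range(n)]
        b = [rng.gauss(0.0, sigma) for _ in range(n)]
        if z_test_p_value(a, b, sigma) < alpha:
            rejections += 1
    return rejections / n_sims

# H0 is true (no shift): every rejection is a false positive (type I error).
type_1_rate = rejection_rate(true_shift=0.0)

# H0 is false (real shift of 0.5): every non-rejection is a false negative (type II error).
type_2_rate = 1 - rejection_rate(true_shift=0.5)

print(f"type I error rate:  {type_1_rate:.3f}")
print(f"type II error rate: {type_2_rate:.3f}")
```

With no real difference, the rejection rate should land close to α. With a real but modest difference and only 30 observations per group, a sizable share of experiments still fail to detect it, which is why sample size matters so much.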

People spend so much time backtesting trading strategies, but I believe that without statistics, the essential context needed to make sense of the results is missing. As an example, here is some data I saw recently:

Sample backtesting results (12-18-15)

With regard to average trade, groups A and D look best, but we need something more to make that determination. An average trade PnL of $15 for groups A and D is 50% more than for groups B and C! Is that a real difference or is it likely to have occurred by chance? Sample sizes would affect our evaluation of this question, as would variance within and between the groups. Inferential statistics wrap all these factors together into context we can interpret. Without inferential statistics, we really can’t know much at all.
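To see how sample size and variance change the answer, here is a rough sketch comparing an average trade of $15 against $10 (the $10 is inferred from the "50% more" statement). The standard deviations and trade counts are hypothetical, since the original table does not survive here, and the large-sample normal approximation is a simplification; with small samples a t-test would be the more usual choice.

```python
from math import sqrt
from statistics import NormalDist

def approx_p_value(mean_a, mean_b, sd, n):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumes, for simplicity, that both groups have the same standard
    deviation `sd` and the same number of trades `n`.
    """
    standard_error = sd * sqrt(2 / n)
    z = (mean_a - mean_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical scenarios: (label, per-trade standard deviation in $, trades per group)
scenarios = [
    ("few trades, noisy", 50.0, 50),
    ("many trades, noisy", 50.0, 2000),
    ("few trades, quiet", 10.0, 50),
]

for label, sd, n in scenarios:
    p = approx_p_value(15.0, 10.0, sd, n)
    print(f"{label:20s} sd=${sd:.0f}  n={n:4d}  p={p:.4f}")
```

Under these made-up numbers, the same $5 gap is nowhere near significant at α = 0.05 with 50 noisy trades per group, yet falls below 0.05 with either 2,000 trades or much lower variance. The averages alone cannot tell these cases apart.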