Option Fanatic
Options, stock, futures, and system trading, backtesting, money management, and much more!

Crude Oil Strategy Mining Study (Part 3)

Today I will start to analyze results of my latest study on crude oil.

I ended up running four 3-way ANOVA tests. Recall the factors I am testing:

I am running these analyses on five dependent variables:

The first four are performance-related while the last is for curiosity.

Along the lines of “conventional wisdom,” here are some hypotheses:

  1. Four-rule strategies should outperform two.
  2. Best strategies should outperform worst.
  3. Long (short) strategies should outperform if the market climbs (falls) during the incubation period.
  4. Order of IS and OOS periods should not make a difference since strategies are selected on performance over both.
  5. Number of trades should be fewer for 4-rule than for 2-rule strategies (fifth-to-last paragraph here).
  6. The software is capable of building profitable strategies.

Here are the results:

[Table: ANOVA significance summary (7-14-20)]

Let’s begin with something of an eye-opener: 2-rule strategies averaged PNLDD 0.15 and PF 0.99 vs. PNLDD 0.02 and PF 0.95 for 4-rule strategies. I find this surprising for two reasons. First, the simpler strategies did better (see hypothesis [1]). Second, the signs are misaligned for the 2-rule group: a positive PNLDD reflects profit while a PF below 1 reflects loss. I checked every strategy individually for sign agreement, so how can overall PNLDD reflect profit when overall PF reflects loss? The answer is that drawdowns and the relative magnitudes of gains and losses differ from strategy to strategy. Once the per-strategy values are averaged together, sign agreement no longer has to hold.

Consider this example of two strategies with two trades each:

[Table: Apparent PNLDD and Avg Trade contradiction (7-15-20)]

Signs align between PNLDD and Avg Trade within each strategy, but the averages across strategies do not align. PNLDD may be a decent measure of risk-adjusted return, but it cannot be studied alone: its sign must be compared with that of Avg Trade (or PNL). Forgetting this is like adding the numerators of fractions with unequal denominators (i.e. wrong).
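The same effect can be reproduced in a few lines of Python. This is only a minimal sketch: the trade values are made up for illustration and are not the study's data, the strategy names A and B are hypothetical, and I am assuming PNLDD means net PNL divided by maximum drawdown of the cumulative PNL curve (starting from zero).

```python
from statistics import mean

def max_drawdown(trades):
    """Largest peak-to-valley drop of the cumulative PNL curve, starting at 0."""
    equity = peak = worst = 0.0
    for pnl in trades:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def pnldd(trades):
    # Assumed definition: net PNL / max drawdown.
    # Undefined if the strategy never draws down (division by zero).
    return sum(trades) / max_drawdown(trades)

def avg_trade(trades):
    return sum(trades) / len(trades)

# Hypothetical per-trade PNL in dollars: two strategies, two trades each.
strategies = {
    "A": [-100, 300],    # small drawdown, modest profit
    "B": [-1000, 500],   # large drawdown, larger loss
}

per_strategy = {
    name: (pnldd(trades), avg_trade(trades))
    for name, trades in strategies.items()
}
mean_pnldd = mean(p for p, _ in per_strategy.values())
mean_avg_trade = mean(a for _, a in per_strategy.values())

for name, (p, a) in per_strategy.items():
    print(f"{name}: PNLDD {p:+.2f}, Avg Trade {a:+.2f}")
print(f"Mean: PNLDD {mean_pnldd:+.2f}, Avg Trade {mean_avg_trade:+.2f}")
```

Within each strategy the two metrics share a sign, yet the mean PNLDD comes out positive while the mean Avg Trade comes out negative. Strategy A's small drawdown inflates its ratio, so it dominates the PNLDD average even though strategy B dominates in dollar terms.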

If I do nothing else today, this discovery alone makes the day an insightful one.

Getting back to the first factor, R does not significantly affect PNL or Avg Trade. PNL is almost identical over 816 strategies (-$4,014 vs. -$4,015). This is a good argument for risk-adjusted return (like PNLDD) as a more useful metric than straight PNL, even with a constant number of contracts (one, in this study). With regard to Avg Trade, 2-rule strategies outperform ($33 vs. -$41), but the difference is not significant.

Here’s something else to monitor: all results so far are negative (see hypothesis [6]).

I will continue next time.