Option Fanatic
Options, stock, futures, and system trading, backtesting, money management, and much more!

Butterfly Backtesting Ideas (Part 1)

I have completed one exhaustive butterfly backtest on dynamic iron butterflies (DIBF). While helpful for offering up some context, it left much to be desired.

Butterflies seem to be all the rage in trading communities these days, yet my backtest failed to impress mainly because the results were inconclusive. Slippage made the difference between a profitable trade and an unprofitable one. While it is tantalizing to think I can overcome slippage by simply entering a GTC limit order and waiting for a fill, unables (orders that never get filled) do occur. Backtesting cannot fully determine the impact of unables, primarily due to the limited granularity of the data (30-minute intervals).

I have some methodological issues that may have negatively impacted the results. The dynamic nature of the strategy means some trades were symmetric and others were asymmetric. An asymmetric butterfly will have a lower max loss potential to the upside. Even though most losses seem to have taken place on the downside, the symmetric butterfly's much larger upside loss potential (100%) hurts because the downside loss potential is the same either way (100%).

Aside from some trades being symmetric, those that were not had varying degrees of asymmetry. The greater the asymmetry, the lower potential loss to the upside in terms of ROI (%). Perhaps this should be standardized.

The need for standardization feeds directly into the next issue: use of percentages (ROI) instead of PnL. Because margin requirements ranged from $1,401 to $12,400, I used percentages to avoid having to normalize (e.g. two contracts of a $5,000 trade equate to one contract of a $10,000 trade). ROI is unaffected by margin requirements. Now consider a downside loss. Asymmetric and symmetric butterflies can both experience -100% ROIs even though PnL is [much] worse for the asymmetric due to the embedded put credit spread. This doesn’t feel right.

One thing I could do with the DIBF backtest is normalize for margin requirement then recalculate the trade statistics based on PnL. This might serve as confirmation that I was on the right track with the initial analysis.
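A minimal Python sketch of that normalization follows. It is only an illustration and not necessarily how the recalculation would be done: the `Trade` record, the $10,000 `target_margin`, and the three sample trades are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, stdev

@dataclass
class Trade:
    margin: float  # margin requirement for one contract ($), hypothetical field
    pnl: float     # PnL for one contract ($), hypothetical field

def normalize_pnl(trades, target_margin=10_000.0):
    """Scale each trade's PnL as if enough contracts were traded to commit
    target_margin dollars (fractional contracts allowed for simplicity)."""
    return [t.pnl * (target_margin / t.margin) for t in trades]

# Made-up trades; the margins span the range mentioned in the post
trades = [Trade(1_401, -1_401), Trade(5_000, 400), Trade(12_400, -3_100)]
normalized = normalize_pnl(trades)

print([round(x) for x in normalized])               # per-trade PnL at equal margin
print(round(mean(normalized)), round(stdev(normalized)))
```

Note that scaling every trade to the same margin is the same as multiplying its ROI by a constant, so statistics computed this way should rank trades exactly as the ROI analysis did. Comparing raw dollar losses between symmetric and asymmetric trades would require a different basis, such as a fixed number of contracts.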

Rolling Naked Puts (Part 2)

Last time I presented some data in an attempt to replicate the Market Measures (MM) episode from November 8, 2016.

The backtesting methodologies are different. MM started with 0.30-delta naked puts. They took assignment at expiration and sold a 0.30-delta call against it. I started with 0.20-delta naked puts and whenever a 3x stop-loss was hit, I rolled out to the next available expiration month.

My reason for rejecting the roll adjustment had to do with the larger max drawdown (DD) and standard deviation (SD) of returns. MM did not present these statistics.

The Tasty Trade mantra “trade small and trade often” alleviates the DD/SD concern. This is an effective means to cover their *ss because no one loss would ever be catastrophic. I feel this works for people trading part-time as a hobby who have an independent paycheck consistently coming in.

I do not believe “trade small and trade often” works for people trading full-time as a business, however. Unless capitalized with millions of dollars, one cannot trade “small” and still be able to cover living expenses. When position sized as a viable business, most traders do not have enough diversified strategies to avoid widespread portfolio devastation should a naked put max DD type of event occur. This is why I believe max DD and SD of returns are necessary to design a workable trading plan.

My backtest included 509 trades that hit a 3x stop-loss but excluded an additional 376 trades that hit the 2x stop-loss level. Naked puts that lose on the roll are those where the market does not recover or continues to tank lower. These are the most severe losses and are included in the present backtest. Backtesting trades hitting the 2x stop-loss that do not go on to hit 3x would therefore probably improve overall results.

I do not believe inclusion of the 2x stop-loss trades would be enough to save the rolling strategy, however. While I would expect to see more than a 15% improvement, even if PnL improves by up to 50% (for example), the roughly 5x max DD realized when rolling (an increase of over 400%) would necessitate an 80% decrease in position size to equate the DDs, and 80% >> 50%.
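The trade-off can be put into numbers with a short sketch. It assumes the rolling max DD is five times the non-rolling figure (which is what an 80% cut in position size implies) and that PnL scales linearly with position size. The function name and parameters are hypothetical.

```python
def size_adjusted_pnl(pnl_improvement, dd_multiple):
    """Relative PnL after shrinking position size so that max DD matches
    the non-rolling strategy.

    pnl_improvement: fractional PnL gain from rolling (0.50 means +50%)
    dd_multiple:     rolling max DD divided by non-rolling max DD
    """
    size = 1.0 / dd_multiple  # position size that equates the two DDs
    return (1.0 + pnl_improvement) * size

for gain in (0.15, 0.50):
    print(f"PnL gain {gain:.0%}: {size_adjusted_pnl(gain, dd_multiple=5.0):.0%} of non-rolling PnL")
```

Under these assumptions, even a 50% improvement leaves the rolling strategy with about 30% of the non-rolling PnL once position size is cut to equate the drawdowns.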

Rolling Naked Puts (Part 1)

The motivation for this study comes from the Market Measures (MM) episode discussed here. Is rolling naked put losers a viable strategy for improving trade results?

Here are some important details from my study:

Here’s what I found:

20-delta naked put rolling comparison on trades hitting 3x stop-loss (2-20-17)

Out of 509 trades, only 200 lose on the roll. This represents a 60% reduction in number of losers as shown in the third column. Rolling reduces net loss by about 15% and the average trade improves on the roll by the same amount (rows 4-5). That is the good news.

Bad news starts with row 6: the worst loss increases by over 400% with rolling. The standard deviation (SD) of trade results also increases over 650%. I have discussed many times how maximum drawdown (DD) and SD both represent risk (e.g. here, here, and here). If I position size based on max DD then I would have to trade five times smaller with rolling than without. A 15% improvement in PnL is hardly going to compensate for that.

Days in trade (DIT) is obviously larger with the roll. It more than triples, though, and this will dilute the PnL improvement. Three non-rolling trades could be done in the roughly 74 days it would take for one rolling trade. With a win rate over 84%, odds are the sum of three trades will better the PnL of the rolled trade.

Looking back to the MM presentation, the biggest difference between my analysis and theirs is what they did not present. I have detailed this critique elsewhere. Like the MM episode, I looked at success rate, average PnL, and DIT. I also discussed max DD and SD of results, though, where MM did not. As it turns out, these are the statistics I find to be most decisive and they are the biggest reason I believe rolling is not viable. MM arrived at the opposite conclusion.

I don’t believe the full picture of rolling can be understood without analysis of max DD and PnL SD. Without them you may be headed for an awful surprise when max rolling DD is experienced with real money! That would not make for a pleasant day.

I will continue next time.

Professional Performance (Part 2)

Last time I discussed management of other people’s money. Today I want to focus on my recent trading performance.

On September 21, 2015, I began trading my personal account in a manner very similar to the way I would professionally manage money for others. I have traded every single day while adhering to a defined set of guidelines for opening trades. The little flexibility I have maintained with regard to position sizing and closing trades would be omitted as a professional money manager. Rather than using discretion, if someone wanted to squeeze out more return then I would discuss the possibility of a larger portfolio allocation to my services.

Here is a graph of net ROI from 9/21/15 through the end of 2016:

Trading performance vs benchmark (ROI) (9-21-15 thru 12-30-16)

Over 15+ months, I have outperformed the index 29.2% to 16.8%.

Here is a graph of maximum drawdown (DD):

Trading performance vs benchmark (max DD) (9-21-15 thru 12-30-16)

In addition to a larger total return, my max DD was smaller than the index: -10.0% vs. -20.7%. February 2016 offered a moderate market pullback and my ability to keep DD in check resulted in a 3.6x better risk-adjusted return (ROI divided by max DD): 2.92 vs. 0.81.
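For reference, the risk-adjusted figures can be reproduced from the numbers quoted above. This is a small sketch and the function name is made up for illustration.

```python
def risk_adjusted_return(roi_pct, max_dd_pct):
    """ROI divided by the magnitude of max drawdown (both in percent)."""
    return roi_pct / abs(max_dd_pct)

mine = risk_adjusted_return(29.2, -10.0)
index = risk_adjusted_return(16.8, -20.7)

print(round(mine, 2), round(index, 2), round(mine / index, 1))
```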

Because ROI and max DD are both a function of position size, I graphed daily portfolio margin requirement (PMR) as a percentage of account value:

Trading performance (PMR percentage) (9-21-15 thru 12-21-16)

PMR ranged from 8.27% to 62.2% with an average (mean) of 28.6% (standard deviation 11.9%).

Professional Performance (Part 1)

A couple months ago I did a long-awaited performance update. Today I want to focus on the last part of that record.

Over the last few years, I have contemplated the idea of trading for others. The main thing holding me back is doubt that others will find me credible. I consider myself an industry outsider (i.e. have never worked for a financial firm) and I don’t have an official track record.

One way to generate an official track record would be to create an incubator fund. My understanding is trading the incubator fund would be no different from what I currently do except that it would be audited by an expensive accountant. An accountant with a solid reputation may leave the performance record with more credibility.

Then I read posts like this and get really discouraged:

     > Unfortunately with hedge funds, you cannot
     > use any track record from a previous fund
     > or personal trading (even if audited). So if
     > you are a famous fund manager you will raise
     > all you want off your reputation because
     > everyone is aware of the returns you have
     > generated in the past—returns that cannot
     > be used to market the new fund. With regard
     > to an audit, the minimum cost of a respected
     > firm is $30K per year. It is supposedly
     > expensive because of their liability. But it
     > would be a waste of time anyway because
     > institutions don’t pay attention to funds
     > smaller than $40M. Even the funds of funds
     > looking to invest in small time start-ups
     > don’t even peek if you’re under $10M. So it’s
     > completely on you to impress others enough
     > to raise the initial millions and start building
     > that track record. Start-ups with little capital
     > face those headwinds of large audit/tax/legal
     > expenses. If you don’t have “friends and family”
     > willing to contribute based on their years of
     > knowing you then your only hope is to
     > outperform most other hedge funds out there,
     > which is not easy to do.

Does this guy know what he’s talking about? I really don’t know but it does not sound encouraging.

I will continue next time.

Musings on Naked Puts in Retirement Accounts (Part 2)

Today I want to continue the comparison between vertical spreads and naked puts (NP) to better understand the pros/cons when traded in retirement accounts.

Employing leverage makes for a more compelling IRA strategy but a very clear and present danger exists. Look at the graph shown in the previous post. At expiration, a 100% loss on the vertical spread will be incurred if the market falls to 419. The naked put, in this case, will have lost no more than (497 – 419) / 497 * 100% = 16%.
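The comparison can be sketched in a few lines of Python. The strikes are assumptions inferred from the arithmetic above (short put at 497, long put at 419), premium is ignored as it is in the text, and the function names are hypothetical.

```python
def naked_put_loss_pct(short_strike, price):
    """Expiration loss on a cash-secured naked put as a percentage of the
    strike (premium received is ignored)."""
    return max(short_strike - price, 0) / short_strike * 100

def vertical_loss_pct(short_strike, long_strike, price):
    """Expiration loss on a put vertical as a percentage of the spread
    width (premium ignored). The loss is capped at the width."""
    width = short_strike - long_strike
    intrinsic = min(max(short_strike - price, 0), width)
    return intrinsic / width * 100

# Assumed strikes: short 497, long 419
for price in (497, 458, 419, 380):
    print(price,
          round(naked_put_loss_pct(497, price), 1),
          round(vertical_loss_pct(497, 419, price), 1))
```

At 419 the vertical has lost everything at risk while the naked put has lost roughly 16% of its notional. Below 419 only the naked put keeps losing.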

Market crash scenarios must therefore be considered. Throughout history, the market has periodically incurred drops equal to or greater than the magnitude just described. I must limit position size as an attempt to prevent total spread risk from striking too damaging a psychological blow to my total net worth in case this should occur.

Wrapping my brain around the concept of leverage has been challenging. In the first blog post hyperlinked above, I wrote:

> Suppose I sell a 1000 put for $3.00 and buy a 500 put
> for $0.30. I have sacrificed 10% of my potential return
> to halve my risk. If I traded two of these spreads then
> I have similar risk to the single naked 1000 put and my
> potential profit is $2.70 * 2 = $5.40 instead of $3.00.

The clause about having similar risk is correct: risk in either case is roughly 1,000 points * $100/point = $100,000. However, the market must crash to zero for the NP to realize max loss. The market must only drop to 500 for max loss to be realized on the vertical spread. This is extremely rare but think back to the 2008 financial crisis for a point of reference.

In Part 1 of the series linked above, I wrote:

> A leveraged account can go to zero long before the
> underlying assets do.

Leverage is dangerous because losses are magnified when the market moves against me. This is the flip side of what makes leverage attractive: lowering the cost to enter a position.

The vertical spread is like a NP on steroids. While total risk is decreased (assuming constant position size), the probability of losing everything at risk is increased. For this reason and because a NP qualifies under the “unlimited risk” umbrella, my instincts recommend limiting portfolio allocation for these short premium strategies to 20%.

I think the vertical spread can offer one additional benefit in case of that dreaded market crash. This I will cover next time.

Musings on Naked Puts in Retirement Accounts (Part 1)

I am not a proponent of trading naked puts (NP) in retirement accounts. The addition of a long put converts the NP to a put vertical spread. Might the vertical be a candidate for retirement account trading?

My argument against NPs in retirement accounts begins with the observation that retirement accounts cannot be margin accounts. I was unable to find a particular regulation that prohibits this but I don’t know of any brokerage that allows an [Roth] IRA to support any kind of loan. Margin is a loan, which would therefore be prohibited in an [Roth] IRA account.

Being resigned to trade NPs in a cash account simply does not seem like an attractive use of capital. If I have a $100,000 account then I can only sell one 1000 NP. If the put trades for $3.00 then this is a 0.3% return. If I can do this once per month then my potential annualized return is about 3.6%. As Shania Twain used to say, that don’t impress me much.

Portfolio margin—not suitable for a retirement account (see above)—makes the most sense to me for trading NPs.

Employing leverage by purchase of a long put is one alternative to make NPs more attractive for retirement accounts. In the previous example, if I buy the 900 put for $1.00 then I cut risk by 90%. Now I might be looking at a return of 2% per month or 24% per year. This is worth considering.
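The arithmetic behind both return figures can be checked with a short sketch. It uses the prices from the example, treats the spread width as the capital at risk (as the text does), and annualizes by simple multiplication without compounding. The function name and the $100 multiplier are assumptions.

```python
def monthly_return_pct(credit, risk_points, multiplier=100):
    """Net credit as a percentage of the capital at risk."""
    return (credit * multiplier) / (risk_points * multiplier) * 100

# Cash-secured 1000 put sold for $3.00
naked = monthly_return_pct(credit=3.00, risk_points=1000)
# Same short put plus a long 900 put bought for $1.00
vertical = monthly_return_pct(credit=3.00 - 1.00, risk_points=1000 - 900)

print(f"naked put: {naked:.1f}% per month, {naked * 12:.1f}% per year")
print(f"vertical:  {vertical:.1f}% per month, {vertical * 12:.1f}% per year")
```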

While purchase of the long significantly boosts potential ROI, it is not a panacea. The vertical spread does not affect maximum drawdown (DD) unless the market falls far enough to put the long put ITM. If the long put is purchased for cheap then this represents a significant market crash, which is rare. Similarly, the vertical spread does not decrease standard deviation of returns (another measure of risk as discussed here and here) unless that “significant market crash” occurs.

To illustrate, below is a risk graph of a naked put and a put vertical spread:

Naked put vs. put vertical risk graph (3-13-17)

The red arrows highlight how the vertical spread stops losing money by 419 on the downside (green line) whereas the NP continues to lose money as the market drops below 419 (brown line).

Other disadvantages to the vertical spread include the additional cost and transaction fees. Being two options instead of one, a vertical spread usually incurs twice the transaction fees of a NP. Based on my experience trading in fast-moving markets, I would expect to pay [much] more than 2x under these rare conditions. This makes sense to me because under these circumstances, the most efficient way for a market maker to survive is by taking the simplest trades and executing them quickly to serially mediate risk.

I will continue this discussion next time.

Leverage (Part 2)

I left off discussing the concept of leverage with regard to my previous backtesting. Today I will go one step further.

I believe maximum drawdown (DD) is as important a performance component as net income (also “total return”) because use of max DD to calculate position size can minimize risk of Ruin. If you don’t care about blowing up (i.e. Ruin) then it’s simply a matter of what can keep you from a good night’s sleep. DD is the answer here as well.

Position size is one of two ways leverage may be managed. Investment advisers assess risk tolerance in an attempt to help clients maintain a good night’s sleep. Account size and risk tolerance together viewed in terms of variable DD levels determine position size. This is not an exact science because maximum DD is only known in retrospect, which is why it’s called “investing” rather than just “winning.”

In Naked Put Study 2, maximum DD is 3.7x larger for long shares than for naked puts (NP). If I position sized the long shares properly to maintain that good night’s sleep then the NP position sizing could have been up to 3.7x larger without incurring a worse DD. This equates to net income 127% larger for NPs than for long shares.

Besides changing position size, the second way to manage leverage is to employ put credit spreads instead of NPs. I brainstormed this idea here and here.

The long put offsets “unlimited risk” by narrowing the width of the spread. If I sell a 1000 put then the potential loss is 1,000 points * $100/point = $100,000. If I also buy a 500 put for dirt cheap then my potential loss is only (1000 – 500) points * $100/point = $50,000. I halve my risk for only a slight decrease in net profit. Employing leverage in this way creates a cheaper trade with a similar potential return.

The benefit of buying long puts may be seen by equating the total risk. Suppose I sell a 1000 put for $3.00 and buy a 500 put for $0.30. I have sacrificed 10% of my potential return to halve my risk. If I traded two of these spreads then I have similar risk to the single naked 1000 put and my potential profit is $2.70 * 2 = $5.40 instead of $3.00. That is an increase of 80%.

I prefer some decrease in total risk when I employ leverage. Instead of selling two 1000 puts for $6.00 and incurring $200,000 risk, perhaps I sell three 1000/500 spreads and incur $150,000 risk while having a potential profit of $2.70 * 3 = $8.10. This is a 35% increase in profit potential with a 25% decrease in risk. I like that.
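Here is a small sketch that reproduces these numbers. It follows the text in measuring risk as strike width times $100 per point (premium is not netted against risk), and the `position` helper is hypothetical.

```python
MULTIPLIER = 100  # dollars per index point

def position(n, short_strike, credit, long_strike=0.0, debit=0.0):
    """Return (max risk, max profit) in dollars for n contracts.
    A naked put is modeled as a spread whose long strike is zero."""
    risk = n * (short_strike - long_strike) * MULTIPLIER
    profit = n * (credit - debit) * MULTIPLIER
    return risk, profit

naked_risk, naked_profit = position(2, 1000, 3.00)
spread_risk, spread_profit = position(3, 1000, 3.00, long_strike=500, debit=0.30)

profit_change = spread_profit / naked_profit - 1
risk_change = spread_risk / naked_risk - 1
print(f"{profit_change:+.0%} profit potential, {risk_change:+.0%} risk")
```

Changing the number of spreads from 3 to 4 brings risk back to $200,000, which is the equal-risk comparison from the previous paragraph scaled up by two.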

Leverage (Part 1)

When I think about the largest catastrophes ever attributable to options (arguably LTCM and the 2008 financial crisis, which involved an alphabet soup of derivatives), one word that sums up the root cause is “leverage.”

Leverage is important—not only when it comes to television but most assuredly when it comes to options. Investopedia defines leverage as: “the use of various financial instruments or borrowed capital, such as margin, to increase the potential return of an investment.” It goes on:

     > For example, say you have $1,000 to invest.
     > This amount could be invested in 10 shares of
     > Microsoft (MSFT) stock, but to increase leverage,
     > you could invest the $1,000 in five options
     > contracts. You would then control 500 shares
     > instead of just 10.

A cash account that does not allow trading on margin employs no leverage. The only way to “blow up,” or lose everything, is to invest the entire account and see the underlying assets (for long positions) go to zero. It’s very rare that stock prices go to zero (e.g. corporate bankruptcy). No broad-based (U.S.) index has ever gone to zero.

While leverage is exciting because upside exceeds 1:1, the same may occur on the downside resulting in a greater risk of blowing up. A leveraged account can go to zero long before the underlying assets do.

I have previously done research aiming to compare performance between long shares and naked puts (NP) while keeping leverage constant. This discussion can be seen here and here. I added $5M of risk each day and when I removed risk in one group, I removed the same amount of risk in the other.

The graphs shown here and here are particularly powerful. They show the NP strategy to generate a lower gross return and a much lower drawdown (DD).

While increasing leverage is effectively an increase in position size, position size can be too large without employing any leverage. Long shares purchased in cash accounts are not utilizing margin but the account can still blow up. In retrospect, the position size can always be said to have been too large. The minimum capital to trade a strategy is at least the maximum DD ever seen and the longer a backtest, the more likely the backtested max DD is to meet or exceed future market pullbacks. This certainly is not guaranteed and given a long enough trading horizon is not even likely.

I will continue next time.

Maximum Excursion Study

I previously did a study on maximum adverse excursion. Today I will discuss another study I did on maximum excursion (ME) in November 2015.

Excursions can be favorable or adverse. Maximum adverse (favorable) excursion is the largest loss (gain) during the lifetime of a trade. This is abbreviated MAE (MFE).

Although I selected a period of 23 trading days for this study, I should repeat the study over different periods to make sure the results are stable and not a fluke. A period of 23 trading days corresponds to roughly one calendar month.

I used index prices for this study and looked at the MAE (downward price moves) and MFE (upward price moves) over the next 23 trading days. The study covered 3,681 data points from Jan 1, 2001, through Oct 29, 2015.

I stratified results by ventiles of a price oscillator (Osc). Osc reflects closing price as a percentage of the 23-day range ending today. Osc may range from 0 to 100. If the index closes today at a 23-day high (low) then today’s Osc reading will be 100 (0). Lower (higher) values of Osc correspond to oversold (overbought) market conditions.
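The calculations described above might look something like the following Python sketch. It rests on several assumptions that the post does not spell out: the 23-day range is built from closing prices only, excursions are expressed as a percentage of today's close, and the 20 groups are equal-width bins of 5 Osc points each rather than equal-count groups. All function names are made up for illustration.

```python
def oscillator(closes, i, period=23):
    """Close on day i as a percentage of the range of closes over the
    `period` days ending on day i (0 = period low, 100 = period high)."""
    window = closes[i - period + 1 : i + 1]
    lo, hi = min(window), max(window)
    if hi == lo:
        return 50.0  # flat window; midpoint chosen arbitrarily
    return (closes[i] - lo) / (hi - lo) * 100

def excursions(closes, i, period=23):
    """(MAE, MFE) in percent of day i's close over the next `period` days.
    MAE is zero or negative; MFE is zero or positive."""
    future = closes[i + 1 : i + 1 + period]
    mae = min(min(future) / closes[i] - 1, 0) * 100
    mfe = max(max(future) / closes[i] - 1, 0) * 100
    return mae, mfe

def ventile(osc):
    """Map an oscillator reading in [0, 100] to a bin index 0..19."""
    return min(int(osc // 5), 19)

# Synthetic, steadily rising prices just to exercise the functions
closes = [100 + d for d in range(50)]
i = 23
mae, mfe = excursions(closes, i)
print(oscillator(closes, i), ventile(oscillator(closes, i)), mae, round(mfe, 1))
```

With real index data, the bin index from `ventile` would be the grouping key, and the MAE and MFE values in each group would be averaged.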

Here are the averages (mean) and standard errors of the mean (SEM) for MAE:

RUT MAE x Price Oscillator (23-day period) graph

Here are the means and SEM’s for MFE:

RUT MFE x Price Oscillator (23-day period) graph

Both ME’s seem to decrease in magnitude as Osc becomes larger.* Furthermore, the variability (SEM) of the data seems to decrease as Osc increases with the exception of the first bar. Sample size is used to calculate SEM so I graphed it:

RUT ME x Price Oscillator (23-day period) sample size graph

A disproportionate number of occurrences take place at the extremes. Furthermore, twice as many occurrences take place at the high as the low. Sample size could therefore explain why the smallest error bar is seen at the right edge of the graph.

Standard deviation (SD) is a measure of variability that does not correct for sample size. Here is a graph of SD across all 20 ventiles of Osc:
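The difference between the two measures comes down to SEM = SD / sqrt(n). The sketch below uses made-up numbers to show how a larger sample shrinks SEM while leaving SD roughly unchanged. The helper name is hypothetical.

```python
from math import sqrt
from statistics import stdev

def sd_and_sem(values):
    """Sample standard deviation and standard error of the mean."""
    sd = stdev(values)
    return sd, sd / sqrt(len(values))

small = [-2.0, 0.0, 2.0]  # 3 made-up observations
large = small * 100       # same spread, 300 observations

sd_small, sem_small = sd_and_sem(small)
sd_large, sem_large = sd_and_sem(large)
print(round(sd_small, 2), round(sem_small, 2))
print(round(sd_large, 2), round(sem_large, 2))
```

This is why an Osc bin with many more occurrences can show a small error bar even if its underlying variability is no lower than that of its neighbors.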

RUT ME x Price Oscillator (23-day period) standard deviation graph

SD makes the case for an inverse relationship between variability and Osc.

The smaller magnitudes of excursion as the market becomes more overbought corroborate the variability finding. Together these paint a picture of greater stability in bull markets.

This is not about trend-following vs. mean-reversion behavior. When the market is down, the larger up moves (MFE) suggest mean reversion but the larger down moves (MAE) suggest trending behavior. When the market is up, the smaller up moves suggest mean reversion but the smaller down moves suggest trending behavior. These are contradictory.

The findings make more sense from a volatility perspective. Implied volatility generally increases as the market sells off. This means larger moves are expected in either direction, which is just what we see.

* For those interested, a single-factor ANOVA was highly significant for both MAE and MFE (p < 0.0001).