Bullish Iron Butterflies (Part 3)
Posted by Mark on August 28, 2017 at 06:50 | Last modified: May 30, 2017 14:47
Previous posts (here, here, and here) have led me to think I need to redo this backtest with a lowering of transaction fees from $26 to $6/contract. Because that’s going to take months, I have been deliberating over what I might be able to do beforehand to salvage the data I already have. Today I want to focus on spread width.
I have a trade-off to consider when choosing the width of a butterfly trade. The wider the butterfly, the wider the breakevens and the greater the probability of profit. The breakeven widening is less than proportional to width while the total expense (margin requirement: MR) is directly proportional, though. Why spend so much more on a wider trade to get less of an increase in breakevens? Because in terms of ROI (a percentage), I am likely to suffer a smaller loss if the market does not go my way. In other words, the market will have to move more for me to suffer 100% loss on a wide butterfly than a narrow one.
Flying under the radar of the BIBF analysis to date is the fact that I have completely left cost (spread width) out of the discussion. Despite the absence of a critical detail, the analysis appears to stand on its own: anyone disagree?
That changes today. I will stratify performance by spread width to start:
Note the dramatic ROI improvement as width increases. This corroborates the statement above that narrower butterflies are at greater risk of suffering larger percentage losses.
I have two observations to make with regard to standard deviation (SD). First, more winners and losers (on either side of zero) should contribute to larger SD. This could explain the inverse relationship between SD and spread width. Second, I would expect SD to increase with small sample sizes. While sample size (# trades) also appears to be inversely proportional to spread width, I do have acceptable sample sizes up to the 60-70-point categories. I therefore would attribute this inverse relationship more to a greater winning percentage than to higher sample sizes.
The table once again illustrates the average trade of -16%, which corresponds to the 0.44 profit factor. These numbers, reported earlier, led me to say “I do not think this is an optimistic start!”
I can eliminate spread width as a variable by taking the gross margin requirement for the widest trade (100 points * $100/point = $10,000) and allocating that for each trade. By trading this way, I am using only 20% of my capital for a 20-point ($2,000) butterfly:
Instead of -16.18%, the average trade is now -4.22%. That is a big difference!
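The fixed-allocation idea can be sketched in a few lines. This is only an illustration of the arithmetic described above, assuming gross margin equals width times $100/point; the function name `normalized_roi` is hypothetical and is not part of the original backtesting spreadsheet.

```python
# Sketch: restate each trade's ROI against a fixed $10,000 allocation
# (the gross margin of the widest, 100-point butterfly).
# Assumption: gross margin = spread width (points) * $100/point.

DOLLARS_PER_POINT = 100
MAX_WIDTH_POINTS = 100
ALLOCATION = MAX_WIDTH_POINTS * DOLLARS_PER_POINT  # $10,000


def normalized_roi(roi_pct, width_points, allocation=ALLOCATION):
    """Convert ROI on margin into ROI on the fixed allocation."""
    margin = width_points * DOLLARS_PER_POINT
    return roi_pct * margin / allocation


# A 20-point butterfly uses $2,000, or 20% of the allocation.
print(normalized_roi(-100.0, 20))   # -20.0
print(normalized_roi(10.0, 20))     # 2.0
print(normalized_roi(-100.0, 100))  # -100.0
```

The dilution cuts both ways: a full loss on a narrow butterfly shrinks, but so does a winner at the profit target.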
I will continue this discussion in the next post.
Categories: Backtesting | Comments (2) | Permalink
Bullish Iron Butterflies (Part 2)
Posted by Mark on August 25, 2017 at 06:36 | Last modified: May 29, 2017 10:04
With the wheels turning as a result of the last four posts (here, here, here, and here), I have decided to do an analysis of losers sorted by MFE.
I ran the analysis by simulating a reduction in transaction fees (TF) from $26/contract to either $11/contract or $6/contract. How many losers would otherwise be winners?
The average loss remained around 100%. The number of losses, however, decreased dramatically. Reducing TF to $11/contract and $6/contract cut the total number of losses by over 31% and over 57%, respectively.
As impressive as this may seem, the numbers are distorted because they are percentages of percentages. The losing trades are a small fraction of the whole. In order to estimate the overall impact, I counted the losers-turned-winners as +10% and adjusted the average loss downward per the numbers shown above. Here are the revised trade statistics:
For TF $6/contract, we’re now looking at a marginally (PF 1.14) profitable trade.
I believe an average loss being over seven times the average win does significant damage to the trade statistics and I can think of two ways to improve upon this going forward.
First, I can explore implementation of a stop-loss (SL). I need to look at MAE distributions of the winners to see if a SL makes sense. Winners with MAE worse than the SL will become losers and that will hurt.
Second, I can stratify performance by spread width. The statistics so far have hidden the fact that margin requirement varies dramatically across the collection of trades. Narrower butterflies have a lower probability of profit and if these amount to wasted margin then perhaps I would realize some benefit by making the narrower trades wider. Another alternative would be to trade narrower butterflies in smaller size and leave additional capital on the sidelines for dilution (e.g. a 100% loss might then be -50% although a 10% winner might then be +5%).
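The stop-loss check in the first idea above can be sketched as a simple count. The trade records and field names (`pnl_pct`, `mae_pct`) are made up for illustration and the sample values are not backtest results. Because a winner closes at the profit target, its MAE must have occurred before the exit, so a winner with MAE at or beyond the stop level would have been stopped out on an end-of-day basis.

```python
# Sketch: how many winners would a candidate stop-loss turn into losers?
# All trade data below is hypothetical, for illustration only.

def winners_stopped_out(trades, stop_loss_pct):
    """Return the winners whose MAE reached the stop-loss level."""
    return [
        t for t in trades
        if t["pnl_pct"] > 0 and t["mae_pct"] <= stop_loss_pct
    ]


trades = [
    {"pnl_pct": 10.0, "mae_pct": -5.0},
    {"pnl_pct": 10.0, "mae_pct": -35.0},
    {"pnl_pct": 10.0, "mae_pct": -62.0},
    {"pnl_pct": -100.0, "mae_pct": -100.0},
]

for stop_loss in (-30.0, -50.0, -75.0):
    flipped = winners_stopped_out(trades, stop_loss)
    print(stop_loss, len(flipped))
```

A stop level only makes sense if the count of flipped winners is small relative to the losses it would truncate.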
One important issue to address is whether I need to repeat the entire backtest with $6/contract TF’s before I proceed with the above analyses. I will sleep on that.
Categories: Backtesting | Comments (2) | Permalink
End-of-Day Versus Intraday Trading (Part 3)
Posted by Mark on August 22, 2017 at 07:21 | Last modified: May 26, 2017 13:15
This blog series was supposed to be complete after two posts. However, my recent discussion on transaction fees feeds right back into it so I will briefly restate some ideas with the addition of something new.
One way to effectively eliminate slippage is to enter a good ’til cancelled (GTC) closing order for the profit target. With the exception of gaps, this should take me out at +10%. What gets obscured is the fact that this order won’t usually be executed until/unless the midprice goes above +10%. For example, when the order triggers at +10% the midprice may actually be +12%. This [slippage] increases trade duration, which is similar to the negative initial PnL due to transaction fees that I discussed in the last post. The saying “time is money” has never been more true.
Trading live using a GTC order would result in a greater percentage of winners at a lower average ROI. I detailed these points in the first two parts of this blog series. To quantify how many losers might become winners, I could sort the losers by MFE; MFEs falling just short of the profit target are good candidates to become winning trades if exposed to intraday price volatility.
In terms of “something new,” I recently considered tracking the second-largest adverse excursion (SLAE). One way to analyze potential stop-loss (SL) levels is to plot MAE vs. MFE. Of all the trades with an MAE beyond a threshold level, if only a few have MFE at/above the profit target then I incur minimal risk by using that level as a SL. Many times the SL would be triggered the day before MAE is reached. Collecting SLAE data would prevent me from having to go back and retest these losers.
The problem with this idea is that trade PnL will not necessarily be SLAE when using a SL. SLAE works in a particular instance where the market is trending and a trade would be stopped out for a smaller loss the day before it would otherwise reach MAE. The SL could be triggered any number of days before MAE would be otherwise reached, though, which renders SLAE useless. Also, in choppy markets the stop-out day and MAE day may be far away in time.
SLAE is an interesting idea but not one I will add to my backtesting spreadsheet. Collecting SLAE data would take a lot of time and I can easily imagine many losers still in need of retesting with the profile of intratrade PnL being so highly variable.
Categories: Backtesting | Comments (0) | Permalink
Transaction Fees and Backtesting (Part 2)
Posted by Mark on August 17, 2017 at 06:28 | Last modified: May 25, 2017 14:01
My last post discussed reasons to cut back from my $26/contract transaction fee assessment. Today I want to finish up by discussing further implications of transaction fees.
A back-of-the-hand calculation suggests that if I cut transaction fees by Y then I can expect an average trade of X + Y where X was the average trade with transaction fees of $26/contract.
The relationship between transaction fees and PnL is more than linear, though. If I were to repeat the bullish iron butterfly (IBF) backtest with lower transaction fees then I could expect all the winners to remain winners. Days in trade would decrease, though, and this is the wild card.
Being in the trade for a shorter period of time reduces exposure to the IBF’s biggest enemy: big market moves. Some may argue IV spikes are the biggest enemy but the two are usually coincident.
Big market moves will damage prospects most for losing trades with highest MFE. MFE often occurs just before a big market move. In these cases, the move happens and the market never looks back thereby pushing these trades to max loss. Losing trades with MFE near the profit target have the best chance to become profitable given lower transaction fees. Confused? Consider the opposite extreme: losing trades with [lowest possible] MFE equal to initial PnL won’t have a chance regardless of transaction fees because these trades never get off the ground. For the IBF, initial PnL = -8 * transaction fees/contract.
Thought about differently, lower transaction fees mean the initial PnL is greater, which means fewer days of theta decay are required to reach the profit target. I could sort losing trades by MFE to approximate how many trades might benefit.
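One way to approximate this is to shift each loser's MFE by the fee savings and see whether it would then reach the profit target. This is a first-order sketch only: it ignores the days-in-trade effect discussed above, and the trade records and field names (`mfe_dollars`, `margin`) are hypothetical.

```python
# Sketch: which losers might have reached the profit target with lower fees?
# Assumption: fees are charged on 8 contracts per trade, so initial PnL is
# -8 * fee/contract and lowering the fee shifts the whole PnL path up.

CONTRACTS_PER_TRADE = 8
BASE_FEE = 26.0  # $/contract used in the backtest


def fee_savings(new_fee, base_fee=BASE_FEE):
    """Dollar improvement per trade from a lower per-contract fee."""
    return CONTRACTS_PER_TRADE * (base_fee - new_fee)


def losers_that_might_flip(losers, new_fee, target_pct=10.0):
    """Return losers whose fee-adjusted MFE reaches the profit target."""
    flipped = []
    for trade in losers:
        adjusted_mfe = trade["mfe_dollars"] + fee_savings(new_fee)
        target_dollars = target_pct / 100.0 * trade["margin"]
        if adjusted_mfe >= target_dollars:
            flipped.append(trade)
    return flipped


# Hypothetical losers on a $2,000 margin (profit target = $200).
losers = [
    {"mfe_dollars": 150.0, "margin": 2000.0},   # came close to the target
    {"mfe_dollars": 50.0, "margin": 2000.0},
    {"mfe_dollars": -208.0, "margin": 2000.0},  # never got off the ground
]

print(len(losers_that_might_flip(losers, 11.0)))  # 1
print(len(losers_that_might_flip(losers, 6.0)))   # 2
```

The third trade shows the opposite extreme: with MFE equal to the initial PnL, no fee reduction in this range rescues it.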
I need to make sure MFE is tracked correctly in order for this to be useful. I defined MFE as the highest intratrade PnL before expiration. I questioned this metric a couple times while backtesting. Once I suggested tracking MFE after profit target was hit could be useful. Another time I suggested tracking MFE before MAE was hit. If using stop-losses then it might be useful to know MFE before the stop-loss is hit (a better MFE afterward would be meaningless with the trade already closed).
All things considered, I think the MFE methodology is satisfactory given the need described above.
With the goal of cutting transaction fees significantly by being patient with trade entry, counting trades with MAE DTE equal to initial DTE will suggest what percentage of the time this could work. These would be the trades that recorded zero MAE (although as discussed in the last post, the opportunity would still exist to be filled intraday due to usual price volatility).
Categories: Backtesting | Comments (0) | Permalink
Transaction Fees and Backtesting (Part 1)
Posted by Mark on August 14, 2017 at 07:11 | Last modified: May 25, 2017 07:10
I began analyzing the data from my bullish iron butterfly (BIBF) backtest in the last post. The initial results were surprisingly poor so I want to detour and reopen the discussion about transaction fees.
Conservative assessment of transaction fees contributed to making this trade look worse than it might actually be. I also discussed this with regard to the dynamic iron butterfly. I set transaction fees to $26/contract or $208 for the whole trade. The average trade was about -$150; if I were able to do the trade for $6/contract (a nickel slippage per leg) in transaction fees then the average trade would improve to +$10. Suddenly this trade would be worth considering.
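The arithmetic behind that improvement is a straight per-contract adjustment. A minimal sketch, assuming fees are charged on eight contracts per trade ($208 / $26); the function name `adjusted_average_trade` is only illustrative.

```python
# Sketch: linear estimate of the average trade after a fee reduction.
# Assumption: 8 contracts per trade, consistent with $26 * 8 = $208.

def adjusted_average_trade(avg_trade, old_fee, new_fee, contracts=8):
    """Add back the fee savings to the average trade (in dollars)."""
    return avg_trade + contracts * (old_fee - new_fee)


print(adjusted_average_trade(-150, 26, 6))   # 10
print(adjusted_average_trade(-150, 26, 11))  # -30
```

This is only a back-of-the-envelope estimate since it treats every trade's outcome as unchanged apart from the fees.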
A case can be made for significantly less slippage upon exit. I spoke last time about how live trading a losing butterfly would result in significant savings: perhaps knocking transaction fees down from $104 to $30. Another example regards a profitable trade. As expiration approaches, the long legs usually decay to zero. In many cases they are worth only a nickel or dime when the profit target is hit. Rather than closing these at a cost of $26/contract, I would let them expire and save the difference.
Leaving the longs to expire also offers some end-of-cycle crash insurance were the market to make a huge move as expiration approaches. With the shorts already closed at the profit target, the longs remain risk-free. My backtesting does not take this into account. In order to see if the detail is material, I could look for big differences in maximum adverse [favorable] excursion with 2-4 DTE and the expiration PnL to get a sense of how often big moves occur in the final week.
Trade entry is another opportunity to mitigate slippage. A limit order placed at the midprice is likely to be filled within a few trading days due to the usual fluctuation in market prices. Studying MAE distribution would help to quantify this. Any trade that registers a MAE larger than -$416 is effectively a zero-slippage trade.
I am most interested to see what percentage of trades has MAE less than -$416 because these are the ones that may not fill. The risk of going unfilled is actually lower, though, because opportunity exists for intraday drawdowns to occur that also represent zero-slippage entries. Intraday backtesting is very time intensive so the best way to understand this is through live trading. Even if I were to use OptionVue for intraday backtesting, it only offers limited data (every 30 minutes).
Categories: Backtesting | Comments (2) | Permalink
Bullish Iron Butterflies (Part 1)
Posted by Mark on August 7, 2017 at 07:09 | Last modified: June 9, 2017 09:02
Today I begin my report on the bullish iron butterfly.
The structure of this trade is different from the dynamic iron butterfly. This trade was centered 2-3% above the money (split strikes were used if necessary). Also, this was a balanced (i.e. symmetrical) butterfly. The last backtest only included a few balanced butterflies [when the dynamic criterion so ordered].
Trades were held until a 10% profit target was hit on an EOD (3:30 PM ET) basis or until expiration Thursday. I assessed my usual [arguably excessive] transaction fees.
Here are the initial results:
Aside from the ugly profit factor, the first thing I noticed was a max loss of -140.5%. In live trading, the worst loss I would incur on this trade would be a case where one spread goes DITM. OptionVue often shows these spreads to be worth more than their width. While this could potentially happen under low-volume, illiquid conditions, I would never actually close such a spread. Rather, I would hold it until expiration and pay two assignment fees. With this totaling $30 (or less) and the minimum margin requirement ever recorded of $400, the worst loss I could ever technically incur would be -108% ($430 / $400).
For this reason, I went back and changed the max loss on any trade to -108%:
The effect of the changes was minimal. 176 additional trades showed a loss between -108% and -115% but based on the minor impact from mitigating the most extreme losses, I don’t think it’s worthwhile going back to change the others.
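The -108% floor and the capping step can be reproduced in a few lines. This sketch uses the $400 minimum margin and $30 assignment fees quoted above; `cap_loss` is a hypothetical helper, not a spreadsheet function from the backtest.

```python
# Sketch: worst realistic loss and a floor applied to recorded losses.
MIN_MARGIN = 400.0       # smallest margin requirement recorded ($)
ASSIGNMENT_FEES = 30.0   # two assignment fees, $30 or less in total

# Multiply before dividing so the result is exactly -107.5.
worst_loss_pct = -(MIN_MARGIN + ASSIGNMENT_FEES) * 100 / MIN_MARGIN
print(worst_loss_pct)  # -107.5, rounded to -108% in the text

FLOOR_PCT = -108.0


def cap_loss(roi_pct, floor_pct=FLOOR_PCT):
    """Limit a recorded ROI to the worst loss achievable in live trading."""
    return max(roi_pct, floor_pct)


print(cap_loss(-140.5))  # -108.0
print(cap_loss(-50.0))   # -50.0
```

Applying one floor to every trade is conservative because trades with larger margin requirements have a worst case closer to -100%.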
I’m just getting started with this backtesting analysis but I do not think this is an optimistic start! Between 2001 and 2017, I backtested the bullish iron butterfly through many market environments and conditions. While I will separate some of these out and compare in an attempt to identify differences, part of me believes a robust trade should backtest profitably on the whole. This clearly did not.
Categories: Backtesting | Comments (2) | Permalink
Hello!
Posted by Mark on August 1, 2017 at 07:53 | Last modified: May 22, 2017 14:41
Not that you’d realize but I haven’t typed a blog post in a really long time!
Once I get going on a backtest I tend to get caught up in it. The 2-3 hours of backtesting I can handle daily represent an attention-demanding task that wears on my brain. It burns to the extent that I usually spend the balance doing more passive tasks like watching webinars or reading. Blogging—another concentration-intensive task—tends to get pushed aside.
Once the backtest is over I sometimes experience a complete loss of motivation. This describes the last couple weeks, which were punctuated by my seventh marathon. After the backtest is done I can finally proceed with the data analysis. This makes it somewhat inconvenient that my brain wants to take a vacation.
It is what it is.
I have now finished my second butterfly backtest. This time I looked at a classic butterfly. Data analysis is imminent.
Starting with today’s brief post, I will consolidate efforts to get back into the blogging groove.
I’m starting to feel butterflies!
Categories: About Me | Comments (0) | Permalink
Musings on Naked Puts in Retirement Accounts (Part 4)
Posted by Mark on July 27, 2017 at 07:05 | Last modified: October 14, 2017 07:21
If a vertical spread lowers the standard deviation (SD) of returns and max drawdown (DD) compared to a naked put (NP) then its résumé is bolstered as an alternative candidate for retirement accounts. This was an unlikely result in the first example studied.
Rather than quintupling position size for the vertical spread, what if I double it? The potential return would be $4 (rather than $10), which is a 33% increase over the NP. The NP risk is now cut by 80% (rather than 50%) to $20K, which is the breakeven for both trades.
The risk graph now looks like this:
Like the previous example, the vertical spread outperforms if the market rises or if the market falls less than 7%. If the market falls between 7% and 20% then the naked put outperforms. If the market falls more than 20% then the vertical spread outperforms. A market correction over 20% is more likely than a market correction over 50% and this is where the risk metrics (SD of returns and max DD) would be improved by the vertical spread.
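The 20% and 50% thresholds fall out of the gross risk numbers. A rough sketch, ignoring the premium collected (which shifts the exact crossover slightly); the function name is made up for illustration.

```python
# Sketch: how far must the market fall before the naked put has lost as
# much as the vertical spread position can lose in total?
# Assumption: gross risk only, premiums ignored.

def vertical_risk_fraction(n_spreads, width, short_strike):
    """Vertical max loss as a fraction of the naked put's gross risk.

    With premiums ignored, this is also the fractional market decline at
    which the naked put's loss equals the vertical position's max loss.
    """
    return n_spreads * width / short_strike


# 1000/900 put vertical versus one naked 1000 put.
print(vertical_risk_fraction(5, 100, 1000))  # 0.5 -> 50% decline, $50K
print(vertical_risk_fraction(2, 100, 1000))  # 0.2 -> 20% decline, $20K
```

Trading fewer spreads moves the point of outperformance closer to the kind of correction that actually shows up in a backtest window.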
Of course, the vertical spread could be traded in equal position size to the NP. This would generate the first graph shown in Part 3. In that case, no gap of underperformance exists for the vertical spread and any market correction over 10% would generate better risk metrics for the vertical spread.
So going back to my statement in Part 1, does the vertical spread actually improve risk metrics?
The largest market crashes (e.g. fall 2008) will give rise to a lower SD of returns and a lower max DD for vertical spreads.
Unfortunately, these severe crashes occur so rarely that it’s hard to plan a trading strategy around them. The vertical spreads may or may not yield improved risk metrics depending on whether the market corrects, how often it corrects, and the exact magnitude of corrections during the time interval studied.
Compared to NP’s, vertical spreads may improve risk metrics. This is far from guaranteed.
Categories: Financial Literacy | Comments (0) | Permalink
Musings on Naked Puts in Retirement Accounts (Part 3)
Posted by Mark on July 24, 2017 at 06:27 | Last modified: April 5, 2017 13:53
Today I resume discussion of vertical spreads instead of naked puts (NP) in retirement accounts. I mentioned previously that [OTM] vertical spreads don’t usually affect standard deviation (SD) of returns or maximum drawdown (DD). In case of a significant market downturn, however, they certainly can.
Let’s begin with the risk graph comparison posted earlier:
Notice how the green line (vertical spread at expiration) goes horizontal once the market drops ~16% to 419. That is where max loss is hit. The farther the market drops beyond that point, the more the NP (purple line) loses relative to the vertical spread. This represents a lower SD of returns and a lower max DD for the outperforming vertical spread.
This analysis assumes equal position size and offers an important distinction between ROI and gross PnL. In percentage terms, the vertical spread loses more than the NP if the market falls the 16%: 100% for the vertical spread versus [under] 16% [buffered by initial premium collected] for the NP. In terms of gross dollars, the NP and vertical spread lose similar amounts until the vertical spread has lost 100% at which point the NP continues to lose more. 16% is the loss threshold beyond which the vertical spread delivers a lower SD of returns and a lower max DD than the NP.
Now let’s reconsider the naked 1000 put example I presented here. If I sell a 1000 put for $3.00 then [gross] risk is $100,000. If I buy the 900 put for $1.00 then I cut risk by 90%. I could, therefore, trade five times as many verticals while still halving the NP risk. Potential ROI on the vertical spread would be 6.7-fold greater.
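The numbers in this example can be checked with a short calculation. This sketch assumes a standard 100x option multiplier and defines ROI as credit divided by gross risk; the variable names are illustrative.

```python
# Sketch: ROI of five 1000/900 put verticals versus one naked 1000 put.
MULTIPLIER = 100  # assumed contract multiplier


def roi(credit, gross_risk):
    """Potential return as a fraction of gross risk."""
    return credit / gross_risk


# Naked put: sell the 1000 put for $3.00.
np_credit = 3.00 * MULTIPLIER
np_risk = 1000 * MULTIPLIER                      # $100,000

# Vertical: sell the 1000 put for $3.00, buy the 900 put for $1.00.
vert_credit = (3.00 - 1.00) * MULTIPLIER
vert_risk = (1000 - 900) * MULTIPLIER            # $10,000, a 90% cut

n_spreads = 5
total_vert_risk = n_spreads * vert_risk          # $50,000, half the NP risk
ratio = roi(n_spreads * vert_credit, total_vert_risk) / roi(np_credit, np_risk)

print(total_vert_risk)   # 50000
print(round(ratio, 1))   # 6.7
```

The ROI ratio does not depend on the number of spreads; position size only changes the dollar risk.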
Here are the risk graphs of the vertical spread (red line) and NP (blue line):
The problem with the vertical spread is the possibility of losing the entire $50K should the market fall from 1000 to 900 (10%). For the NP to lose $50K the market would have to fall to 500 (50%), which is circled in red. In this case, the vertical spread outperforms if the market rises or if the market falls less than 7%. If the market falls between 7% and 50% then the NP outperforms. If the market falls more than 50% then the vertical spread outperforms.
Because a fall over 50% is so unlikely, this particular vertical spread position would probably post a larger SD of returns and a larger max DD than the NP were the market to enter a meaningful correction.
In the next post I will compare a different vertical spread position to see how it measures up.
Categories: Financial Literacy | Comments (1) | Permalink
End-of-Day Versus Intraday Trading (Part 2)
Posted by Mark on July 21, 2017 at 06:24 | Last modified: March 28, 2017 14:58
As I mentioned last time, a big part of the debate between end-of-day (EOD) and intraday trading involves the difference between the probabilities of touching and expiring. The markets are often regarded as random (Brownian motion). When a particular price level is reached, the market then has a 50/50 chance of moving higher or moving lower. The probability of expiring beyond that level is therefore less than the probability of touching it.
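The touch-versus-expire gap is easy to see in a toy simulation. This is a sketch under a strong assumption, a driftless coin-flip random walk with no volatility changes, and the parameters (`level`, `steps`, `paths`) are arbitrary choices for illustration rather than anything calibrated to the market.

```python
# Sketch: probability of touching a level versus finishing at or beyond it
# in a symmetric +1/-1 random walk (a toy stand-in for Brownian motion).
import random


def touch_vs_expire(level=10, steps=100, paths=5000, seed=0):
    """Return (P(touch level), P(finish at or beyond level)) by simulation."""
    rng = random.Random(seed)
    touched = 0
    expired_beyond = 0
    for _ in range(paths):
        position = 0
        hit = False
        for _ in range(steps):
            position += 1 if rng.random() < 0.5 else -1
            if position >= level:
                hit = True
        touched += hit
        expired_beyond += position >= level
    return touched / paths, expired_beyond / paths


p_touch, p_expire = touch_vs_expire()
print(f"touch: {p_touch:.3f}  expire beyond: {p_expire:.3f}")
```

Every path that finishes beyond the level must have touched it, so the touch probability can never be the smaller of the two.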
For intraday trading, this may be both an advantage and disadvantage. More winners can be exited intraday, which is an advantage. More losers—some of which would otherwise go on to be winners—will also be exited intraday, which is a disadvantage. On trend days, exiting losers (winners) intraday will avoid (preclude) what could otherwise be larger EOD losses (profits), which is an advantage (disadvantage).
This debate is not getting any easier.
Price action aside, another disadvantage to intraday trading is the need to be available and/or take action more than once and possibly whenever the market is open. This takes a lot of flexibility out of the workday.
The biggest disadvantage to intraday trading is arguably a much more complex (or impossible) backtesting proposition. OptionVue (OV) provides data every half hour. If I am going to “trade like I backtest” (mentioned here and here) then I must monitor trades every 30 minutes. Such backtesting would more than quintuple my current 2-5 months per backtest. Continuous market monitoring represents another magnitude of complexity because significant volatility can occur even between 30-minute prints. Backtesting this trading time frame would therefore require a much more granular database.*
As a net seller of option premium, I find time decay to be more certain than typical [random] price action. Every 24 hours an option gets one day closer to expiration. Implied volatility increase can offset time decay in the short-term but this only happens in some instances of down markets, which is [significantly] less than 50% of the time.
Given this additional reasoning, my gut instinct is to give the nod to EOD over intraday trading. A trade is more likely to be stopped out intraday than at the close because option decay into the close may improve the PnL. Being directionally long also favors EOD trading by giving more time to allow for positive drift. The observation that many trades have small MAEs and only a select few have huge MAEs is additional evidence in favor of longer trade duration (EOD).
For me, the exponential complexity or impossibility of backtesting is the proverbial nail in the coffin for intraday trading. These restrictions actually make me wonder whether the perceived benefit of enhanced intraday opportunity is more illusion than anything else.
* – Any discretionary strategy that uses alerts to signal entries, exits, or adjustments implies this sort of intraday, continuous-monitoring approach.
Categories: Financial Literacy | Comments (1) | Permalink







