Automated Backtester Research Plan (Part 7)
Posted by Mark on December 20, 2018 at 06:05 | Last modified: November 9, 2018 15:45

Once done with straddles and strangles, put credit spreads (PCS) are next in the automated backtester research plan.
The methodology is much the same for PCS as for naked puts, which I detailed here and here.
We can first study PCS placed every trading day to maximize sample size. Trades can be entered between 30-64 days to expiration (DTE). The short leg can be placed at the first strike under -0.10 to -0.50 delta by increments of -0.10. We can hold to expiration, manage winners at 25% (ATM options only?) or 50%, or exit at 7-21 DTE by increments of seven. We can also exit at 20-80% of the original DTE by increments of 15%. We can manage losers at 2x, 3x, 4x, and 5x initial credit. I’d like to track and plot maximum adverse (favorable) excursion (no management) for the winners (losers) along with final PnL and total number of trades. I want to monitor winning percentage, average win, average loss, largest loss, profit factor, average trade (average PnL), PnL per day, standard deviation of winning trades, standard deviation of losing trades, average days in trade (DIT), average DIT for winning trades, and average DIT for losing trades.
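The permutations above multiply quickly. As a rough sketch of the scale, covering only a subset of the dimensions and with hypothetical names throughout (nothing here is from an actual backtester API):

```python
from itertools import product

# Illustrative enumeration of some PCS backtest permutations described above.
dte_windows  = [(30, 64)]                          # entry window (DTE)
short_deltas = [-0.10, -0.20, -0.30, -0.40, -0.50]
profit_takes = [None, 0.25, 0.50]                  # None = hold to expiration
time_exits   = [None, 7, 14, 21]                   # exit at fixed DTE
loss_stops   = [None, 2, 3, 4, 5]                  # close at Nx initial credit

grid = list(product(dte_windows, short_deltas, profit_takes,
                    time_exits, loss_stops))
print(len(grid))  # 300 combinations before the percent-of-DTE exits are added
```

Even this partial grid yields 300 parameter combinations, which is why tracking statistics per combination (rather than eyeballing) matters.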
As always, I think maintenance of a constant position size is important. This is easier to do with vertical spreads because the width of the spread—to be held constant for each backtest—defines the risk. We can vary the width between 10-50 points by increments of 10 or 25-100 points by increments of 25 depending on underlying.
My gut says that we do not want long legs acting as unreactive units (standard options) at lower (higher) prices of the underlying. Rather than an apples-to-apples backtest throughout, this would actually be two different backtests with the long leg serving only as margin control at lower underlying prices and as an actual hedge otherwise. Unreactive units may result when the spread width is too large as a percentage of the underlying price: this percentage should be graphed over time. An alternative way of analyzing this is hedge ratio, which can also be graphed over time. Hedge ratio equals decay rate (theta divided by mark) of the short option divided by decay rate of the long. A hedge ratio less than 0.80 is suggestive of long option decay that is too rapid for the short. This may leave the short option unprotected.
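The hedge ratio just described can be sketched directly; the theta and mark values below are purely illustrative, not taken from real data:

```python
def decay_rate(theta: float, mark: float) -> float:
    """Daily decay as a fraction of the option's price (theta is negative)."""
    return abs(theta) / mark

def hedge_ratio(short_theta, short_mark, long_theta, long_mark):
    # Ratio of the short leg's decay rate to the long leg's decay rate,
    # per the definition in the text.
    return decay_rate(short_theta, short_mark) / decay_rate(long_theta, long_mark)

# Illustrative values: short decays 5%/day, long decays 7%/day.
r = hedge_ratio(short_theta=-0.50, short_mark=10.0,
                long_theta=-0.14, long_mark=2.0)
print(round(r, 3))  # 0.714, under the 0.80 threshold: long may decay too fast
```

Graphing this ratio over the life of each backtested trade would flag periods where the long leg stops pulling its weight as a hedge.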
The importance of this last paragraph is subject to debate. I alluded to the subject earlier where I cursorily addressed the feasibility of naked call backtesting altogether.
Shorter dated trades, which have not been discussed thus far in the research plan, may also be studied. I would be interested in studying trades placed at 4-30 DTE with all the permutations given above. We can also use weekly options [when available] subject to a liquidity filter. This filter can check for a minimum open interest requirement or a bid-ask spread below a specified percentage of the mark.
General filters can also be studied as discussed in Part 2 (linked in paragraph #2 above).
I will continue next time.
Categories: Backtesting | Comments (0) | Permalink

Automated Backtester Research Plan (Part 6)
Posted by Mark on December 17, 2018 at 07:10 | Last modified: November 9, 2018 13:41

In the last three posts, I detailed portfolio margin (PM) considerations with the automated backtester. After backtesting naked puts and naked calls separately, the next thing I want to do is add a naked call overlay to the naked puts.
This is not the previously mentioned ATM call adjustment but rather a study of 0.10- to 0.40-delta strangles (by increments of 0.10). Strangles can be left to expiration, managed at 50%, or closed for a loss at 2-5x initial credit. I want to track total number of trades, winning percentage, average PnL, average loss, largest loss, standard deviation of returns, days in trade, PnL per day, PF, RAR, and maximum adverse excursion (MAE). Strangles should be normalized for notional risk; with implementation of PM logic, notional risk can be replaced by the theoretical loss from walking the chain up 15% (or up 12% plus a 10% vega adjustment). With this done, return on capital can be calculated as return on PM (perhaps return on the largest PMR ever seen in the trade, since PMR varies from one day to the next). The maximum subsequent:initial PM ratio should be tracked. We can also study the effect of time stops.
If deemed useful then maximum favorable excursion (MFE) can also be studied for unmanaged trades. This could be studied and plotted in histogram format before looking at ranges of management levels (not mentioned in previous paragraph). With MFE and MAE, some thought may need to be given about whether to analyze in terms of dollars or percentages. If notional risk is somehow kept constant, though, then either may be acceptable.
Incorporating naked calls with filters can also be studied. Naked calls may or may not be part of the overall position at any given time. I am interested in studying MAE by looking at underlying price change over different future periods given specified filter criteria. Any stable edges we identify could be higher-probability entries for a naked call overlay. I approach this with some skepticism since it does imply market timing. As discussed in Part 3, this type of analysis lends itself more to spreadsheet research than to the automated backtester, which would run simulated trades. We would primarily be studying histograms and running statistical analysis on distributions.
Backtesting of undefined risk strategies will conclude with naked straddles. Like strangles, straddles can be left to expiration, managed at 10-50% by increments of 5%, or closed for a loss at 2-5x initial credit. I would want to monitor total number of trades, winning percentage, average PnL, average loss, largest loss, standard deviation of returns, days in trade, PnL per day, PF, RAR, and MAE (MFE?). The same comments given above for strangles regarding PM logic, return on PM, and PM ratios also apply here. We can also study the effect of time stops (managing early from 7-21 DTE by increments of seven).
As discussed with naked puts and calls, I would like to study rolling as a trade management tool. We can reposition strangles in the same (subject to a minimum DTE, perhaps) or following month back to the original delta values when a short strike gets tested or when down 2x-5x initial credit. We can do the same for straddles when an expiration breakeven (easily calculated) is tested as well as rolling just the profitable side to ATM.
Aside from studying straddles and strangles as daily trades, serial [non-overlapping] backtesting can be done in order to generate equity curves and study relevant system metrics, as discussed previously with regard to naked puts and naked calls.
Portfolio Margin Considerations with the Automated Backtester (Part 3)
Posted by Mark on December 14, 2018 at 07:03 | Last modified: November 9, 2018 11:00

Today I continue discussion of portfolio margin (PM) [requirement (PMR)] and the automated backtester.
Please recall that I have described two research approaches. The first analyzes trades opened daily to collect statistics on the largest sample size possible. The second approach studies serial backtesting of non-overlapping trades to generate an account equity curve and to study things like maximum drawdown and risk-adjusted return. The latter lends itself to one sequence of trades out of an infinite number of potential permutations, which is suggestive of a Monte Carlo simulation.
I can definitely see a use for PMR calculations in the daily trades category. For each trade, the automated backtester could approximate PMR at trade inception and for each [subsequent] day in trade. To get a sense of how much capital would be required to maintain the position, we would want to track the maximum value of the subsequent/initial PMR ratio. The amount of capital needed to implement a trading strategy is at least [possibly much more if done conservatively as discussed here and here] the maximum value of the subsequent/initial PMR ratio observed across all backtested trades. In addition to this single number, I would be interested in seeing the ratio distribution of all trades plotted as a histogram and perhaps also as a graph with date on the x-axis, ratio on the left y-axis (scatter plot), and underlying price on the right y-axis (line graph).
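Tracking the maximum subsequent/initial PMR ratio per trade could look something like this minimal sketch (function name and PMR values are hypothetical):

```python
def max_pmr_ratio(daily_pmr):
    """Largest subsequent/initial PMR ratio over the life of one trade.

    daily_pmr: list of PM requirements, first element at trade inception.
    """
    initial = daily_pmr[0]
    return max(pmr / initial for pmr in daily_pmr)

# Hypothetical PMR path for one trade ($ requirement per day in trade):
ratio = max_pmr_ratio([10_000, 12_500, 18_000, 15_000])
print(ratio)  # 1.8: the trade needed 1.8x its entry margin at the worst point
```

Collecting this ratio across all backtested trades feeds the histogram and scatter plots described above.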
PMR calculations might have a place in the serial trades category as well. Plotting equity curves of different allocation percentages is different from whether those portfolios could be maintained depending on max PMR relative to account value. If PMR exceeds account value, then at least some positions would have to be closed. Since it’s impossible to know which positions this would involve or even whether the broker would do it automatically (at random), I might assume a worst-case scenario where the account would be completely liquidated. On the graph, the equity curve would go horizontal at this point. With a consequence this drastic, I think PMR monitoring is worth doing.
In addition to PM, some brokerages have a concentration requirement. One brokerage, for example, looks at the account PnL with the underlying down 25%. The projected loss must be less than 3x the net liquidation value of the account. Violation of this criterion will result in a “concentration call,” which is akin to a margin call. An account can be in the former but not the latter if it holds DOTM positions that would (not) significantly change in value with the underlying down 25% (12%). Closing these options (typically for $0.30 or less) will often resolve the concentration call.
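A minimal sketch of that concentration criterion, assuming the projected loss at -25% is already available (e.g. from walking the chain); the function name and dollar figures are illustrative:

```python
def concentration_call(loss_at_down_25: float, net_liq: float) -> bool:
    """True if the projected loss with the underlying down 25% exceeds
    3x the account's net liquidation value (example brokerage rule)."""
    return loss_at_down_25 > 3 * net_liq

print(concentration_call(loss_at_down_25=350_000, net_liq=100_000))  # True
print(concentration_call(loss_at_down_25=250_000, net_liq=100_000))  # False
```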
Building concentration logic might be useful for backtesting with filters. A large enough account could actually be traded by opening daily positions. Otherwise, implementation of filters could result in multiple open positions (albeit less than one new position per day). Stressing the whole portfolio by walking the chain up 25% would be useful because a strategy that looks good in backtesting but violates the concentration criterion is not viable. Put another way, I cannot project a 20% annual return on capital when the capital actually needed to maintain a strategy is quadruple (quite possible with PM) that projected. In this case, 5% annualized would be a more accurate estimate.
Portfolio Margin Considerations with the Automated Backtester (Part 2)
Posted by Mark on December 11, 2018 at 07:54 | Last modified: October 24, 2018 11:49

Last time I started to explain portfolio margin (PM) and why a model is needed to calculate it.
I previously thought the automated backtester incapable of calculating/tracking PM requirement (PMR) without modeling equations [differential, such as Black-Scholes] and dedicated code, but this is not entirely correct. The database will have historical price data for all options. The automated backtester can simulate position value at different underlying price changes by “walking the chain.” In order to know the price of a particular option if the underlying were to instantaneously move X% higher (lower), I can look to the strike price that is X% lower (higher) than the current market price. Some rounding will have to be involved because +/- 3%, 6%, 9%, and 12% will not fall square on 10- or 25-point increment strikes that may be available in the data set (in highest liquidity and therefore most reliable prices).
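Walking the chain can be sketched as a nearest-strike lookup, with the rounding caveat handled explicitly (strike grid and prices are illustrative):

```python
def walk_chain(strikes, spot, move_pct):
    """Approximate an option's value after an instantaneous percentage move
    by reading the strike that far away in the opposite direction, rounded
    to the nearest listed strike.

    move_pct > 0 means the underlying moves up, so we look at a LOWER strike.
    """
    target = spot * (1 - move_pct)
    return min(strikes, key=lambda k: abs(k - target))

# Illustrative 25-point strike grid around a 2700 underlying:
strikes = list(range(2300, 3001, 25))
print(walk_chain(strikes, 2700, 0.12))   # 2375 (2700 * 0.88 = 2376, rounded)
print(walk_chain(strikes, 2700, -0.12))  # 3000 (3024 rounds to nearest listed)
```

In the real backtester, the returned strike would then index into the historical chain to fetch that option's market price.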
The automated backtester would not be able to perfectly calculate PMR. In order to be perfect, the backtester would need to model the risk graph continuously on today’s date, which would require implementation of differential calculus. Rounding of the sort that I described above is not entirely precise. Also in order to be perfect, we would have to match the PM algorithm used by the brokerage(s). These are kept proprietary.
Another reason the automated backtester would not be able to perfectly calculate PMR is because walking the chain does not take into account implied volatility (IV) changes. [Some] brokerages also stress portfolios with increased IV changes to the downside when calculating PMR.
We can approximate the additional IV stress a couple different ways. First, instead of stressing up and down X% we could stress more to the downside. Second, we could use vega values from the data set in addition to walking the chain. Vega is the change in option price per 1% change in IV. If we want to simulate a 10% IV increase, then, we could add vega * 10 to short positions. This would probably not be exact because vega does not remain constant as IV changes. Vomma, a second-order greek that will not be included in the data set, is the change in vega per 1% increase in IV. The change in option price is actually the sum of X unequal terms in a series as defined by vega and vomma (along with third-order greeks and beyond to be absolutely precise).
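The first-order vega adjustment could be sketched as follows; the numbers are illustrative, and as noted it ignores vomma and higher-order terms:

```python
def iv_adjusted_price(mark: float, vega: float, iv_change_pts: float) -> float:
    """First-order estimate of an option price after an IV change.

    vega: price change per 1-point (1%) IV change. This is only a linear
    approximation since vega itself shifts as IV moves (vomma).
    """
    return mark + vega * iv_change_pts

# Simulate a 10-point IV increase on a short option marked at 4.50
# with vega 0.35 (illustrative numbers):
print(round(iv_adjusted_price(4.50, 0.35, 10), 2))  # 8.0
```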
Regardless of the imprecision, I think some PM estimate given by logic built into the automated backtester would be better than nothing. And my preference would always be to overestimate rather than underestimate PMR.
Portfolio Margin Considerations with the Automated Backtester (Part 1)
Posted by Mark on December 6, 2018 at 06:40 | Last modified: October 24, 2018 11:51

I want to revisit something mentioned in Part 1 about portfolio margin (PM).
Allocation and margin are two separate things with regard to short premium trades and I have only been taking into account the former. I have mentioned allocation with regard to serial backtesting of [non-overlapping] trades. After further consideration, I think margin should be monitored because while we may be able to place a trade, whether we can maintain the position when the market goes sharply against us is a different story.
At some brokerages, accounts of sufficient size can qualify for portfolio (also termed “risk-based”) margin (PM). Reg T margin [which applies to cash, not margin, accounts] reduces buying power by the maximum potential loss at expiration for a given trade. PM uses an algorithm that analyzes profit and loss of the whole portfolio when stressed X% up and Y% down. In other words, if the underlying security were to increase (decrease) by X% (Y%) today (not at expiration), then the algorithm calculates the worst change in value across that range. Specifics vary by brokerage but as an example, the algorithm may calculate -12% and +12% by increments of 3%. The maximum loss at any increment is the portfolio margin requirement (PMR). I will not incur a margin call provided PMR is less than the net liquidation value of my account.
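A hedged sketch of that stress-increment logic, with a toy payoff function standing in for the real portfolio model (the increments match the -12%/+12% by 3% example; everything else is illustrative):

```python
def pm_requirement(pnl_at_move, moves=(-0.12, -0.09, -0.06, -0.03,
                                       0.0, 0.03, 0.06, 0.09, 0.12)):
    """PMR as the worst portfolio loss across the stress increments.

    pnl_at_move: function mapping an underlying % move to portfolio PnL today.
    """
    worst = min(pnl_at_move(m) for m in moves)
    return max(0.0, -worst)

# Toy short-put-like payoff: loses quadratically on down moves (illustrative):
pmr = pm_requirement(lambda m: -1_000_000 * m * m if m < 0 else 1_000 * m)
print(round(pmr, 2))  # 14400.0, realized at the -12% increment
```

No margin call is incurred while this number stays below net liquidation value.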
Calculating PMR requires modeling of the cumulative position. A permanent component of the option pricing equation is implied volatility (IV). IV may be understood as the relative supply/demand for an option. This is inherently unknown, which is why a model is necessary.
As an example to explain this price uncertainty, suppose I am an institutional option trader looking to allocate $50 billion to a specific short premium position. The sooner I get this done, the sooner I have the opportunity to start making daily profit. Once the funds clear, I want to be in regardless of whether the market is up, down, a little or a lot.* You can be sure my $50 billion is going to move some markets by making purchased (sold) options more (less) expensive along with a coincident IV increase (decrease). This is the principle of supply and demand that, in this case, has nothing to do with underlying market move: simply when the bank/brokerage clears my funds for trading. For this and countless other reasons having nothing to do with market movement, unpredictable purchases/sales regularly occur—perhaps in smaller dollar amounts but the aggregate effects can be imagined to be similar.
I will continue next time.
* I may avoid “a lot” if liquidity dries up or bid/ask spreads become large.
Automated Backtester Research Plan (Part 5)
Posted by Mark on December 3, 2018 at 06:06 | Last modified: November 9, 2018 13:30

Today I will finish up the automated backtester research plan for naked calls.
Once done studying daily [overlapping] trade statistics, we can repeat the naked put analysis with a serial trading strategy for naked calls. This involves one open trade at a time. We can look at number of trades, winning percentage, compound annualized growth rate (CAGR), maximum drawdown, risk-adjusted return, and profit factor. Again, equity curves will represent just one potential sequence of trades and some consideration can be given to Monte Carlo simulation. We can plot equity curves for different account allocations such as 10% to 70% of initial account value by increments of 10%.
With both overlapping (daily) and non-overlapping (serial) trades, position size should be held constant to allow for an apples-to-apples comparison of drawdowns throughout the backtesting interval. With naked puts, position size is notional risk. Naked calls, though, have unlimited notional risk. Maybe we deduct 0.05-0.20 from the naked call premium under the assumption that we always purchase the highest-strike call available (for minimal price) to limit margin.
This would result in a vertical spread, though, and the width would be different depending on underlying price.
Does this compromise the feasibility of naked call backtesting altogether? If calls must be done as vertical spreads then buying the long leg for minimal premium will be different from most call credit spread studies to be done for widths 10 (25) to 50 (100) points wide by increments of 10 (25)—except for very low underlying prices where the larger widths may result in the same minimally-priced long being purchased. The naked call study has then become a call credit spread study, which overlaps with the vertical spread backtesting to be detailed later. This deserves further deliberation.
We can apply the same rolling ideas to naked calls as we did to naked puts. We can roll naked calls [up and] out to the next month when a short strike is tested or when the trade is down 2-5x initial credit. We can also roll naked calls up to the same original delta in the same or next month if the strike gets tested.
When studying filters, it will be important to look at total number (and distribution) of trades along with equity curve slopes to determine consistency of profit. Risk-adjusted return and profit factor should also be monitored.
Naked call filters for study are similar to those for naked puts. We can look at trades taken at 5-20-day highs (lows) by increments of five. Trades can be taken only when a short-term MA is above (below) a longer-term MA. As mentioned in the Part 2 footnote, my preference would be to avoid getting overly concerned with period optimization, but this may be unavoidable. Implied volatility (IV) filters may include trades taken with IV at an X-day high (low), on the first down day for IV after being up for two consecutive days, or with IV rank (IVR) above 25 or 50.
I am curious to find out if naked calls can add to total return and/or lower standard deviation of returns.
Next time I will revisit margin considerations.
Automated Backtester Research Plan (Part 4)
Posted by Mark on November 30, 2018 at 07:02 | Last modified: October 17, 2018 09:31

Today I continue with the research plan for naked calls.
As discussed with puts, I think naked call trades should be normalized for notional risk.
I would like to see a distribution of naked call losers in time and in magnitude. Date can be on the x-axis with underlying closing price (line graph) on the right y-axis and trade PnL (histogram) on the left y-axis. It certainly makes sense to do these graphs for expiration. We can also do these graphs for managing winners at 50% (and/or 25%?) and for managing losers.
I suggested managing trades early (i.e. exiting at 7-21 DTE by increments of seven or exiting at 20-80% initial DTE by increments of 15%) for naked puts, but I did not mention it for calls. This is because in backtesting naked calls down to 7 DTE, I am not sure what kind of time stop makes sense. Four, three, and two DTE correspond to expiration Monday, Tuesday, and Wednesday respectively—any of which would seem to be an extremely short time to hold these trades. They could be repeated every week, though. This is subject for debate.
When the market rips higher, naked calls can lose quickly because they are closer to the money. This almost makes me more reluctant to trade naked calls than puts, which is counterintuitive because traditional wisdom says naked puts are most at risk. Naked puts are vulnerable to directional moves—equity markets tend to crash down farther (and faster) than they crash up—and extremely vulnerable to volatility explosion. If volatility affects naked calls at all on strong upside moves then it generally benefits them (going from inflated IV on a pullback to normal/low IV after the rebound). The culprit hiding in the shadows is vertical skew, which makes OTM calls cheap compared to OTM puts.
This line of discussion makes me curious to know how time stops can reduce risk of naked calls despite the above discussion of why they were not mentioned in the previous post. I would be interested in seeing a histogram of PnL (y-axis) by DTE (DIT): high (low) to low (high) moving from left to right along the x-axis. This plot would be for unmanaged trades. I would expect to find that earlier exits mitigate the most extreme losses—but at what cost?
The vertical skew discussion also implies that [if implemented then] naked calls should be traded in a smaller position size than naked puts. I would like the backtest to provide some insight about reasonable position sizing. I want to study the rolling (out or up and out) adjustment and how many rolls have been historically required during the sharpest and most sustained upside moves. As an example of how this could be relevant, suppose risk management is to roll into double the position size when premium increases by 100%. If we don’t think this will happen for more than three consecutive months, then maybe position size for naked calls should be 13% (or less) that of naked puts.
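The 13% figure follows from the doubling arithmetic: three consecutive doublings multiply position size by eight, so the starting size should be at most one eighth of the maximum size we are willing to carry.

```python
# If risk management doubles position size on each roll and we expect at most
# three consecutive rolls, initial size should be capped at 1/2**rolls of the
# maximum tolerable size:
max_rolls = 3
initial_fraction = 1 / 2 ** max_rolls
print(initial_fraction)  # 0.125, i.e. roughly the 13% cited above
```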
I will continue next time.
Automated Backtester Research Plan (Part 3)
Posted by Mark on November 27, 2018 at 06:43 | Last modified: November 9, 2018 13:26

Finishing up the discussion on filters, some can probably be tested on the underlying alone without the automated backtester. By looking just at underlying price, we can plot trade distribution (looking for consistent vs. lumpy). Maximum adverse excursion can also be studied to see whether it improves with filter application. This type of analysis may lend itself more to spreadsheet work and macros. I have numerous research questions that would fit in this category.
Back to the automated backtester, I would like to study rolling as a trade management tool. We can roll naked puts [down and] out to the next month when a short strike is tested or when the trade is down 2-5x initial credit. We can also roll naked puts down to same original delta in the same or next month if strike gets tested.
Some thought may need to be given to calculate days in trade (DIT) for rolling adjustments. If rolling out doubles DIT, for example, then annualized ROI is halved. This may not be the best result. If I’m not looking to calculate [annualized] ROI then this may be a moot point, but we should be aware that for breakeven or normal profit, rolling will significantly increase DIT.
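The DIT effect can be made concrete with a simple (non-compounded) annualization; all dollar figures are illustrative:

```python
def annualized_roi(pnl: float, capital: float, dit: int) -> float:
    """Simple (non-compounded) annualization of a trade's return."""
    return (pnl / capital) * (365 / dit)

base   = annualized_roi(pnl=2_000, capital=100_000, dit=30)
rolled = annualized_roi(pnl=2_000, capital=100_000, dit=60)  # roll doubles DIT
print(round(base, 4), round(rolled, 4))  # 0.2433 0.1217, i.e. halved
```

The same final PnL over twice the days in trade yields half the annualized return, which is why rolling flatters raw PnL more than it flatters ROI.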
As an overlay, another adjustment I am interested in testing is the addition of an ATM short call to manage NPD.
Parts 1 and 2 of this research plan primarily addressed naked puts. The plan is similar for naked calls.
The first phase of naked call backtesting involves overlapping trades. We can study trades entered every day between 7-42 DTE. We can choose the first strike under 0.10 to 0.50 delta by increments of 0.10. We can hold to expiration or manage winners at 25% (ATM options only?) or 50%. We can manage losers at 2x, 3x, 4x, and 5x initial credit. I’d like to track and plot maximum adverse (favorable) excursion (no management) for the winners (losers) along with final PnL and total number of trades. I want to monitor winning percentage, average win, average loss, largest loss, profit factor, average trade (PnL), PnL per day, standard deviation of winning trades, standard deviation of losing trades, average DIT, average DIT for winning trades, and average DIT for losing trades.
My gut leans away from studying longer-term naked calls because of vertical index skew. With the market generally drifting higher and naked calls being cheaper than put counterparts (thereby implying NTM call sales for equivalent premium to farther OTM puts), my bias is toward shorter-term holdings. On the other hand, a 30-64 DTE backtest would allow for an apples-to-apples naked put comparison. This is subject for debate.
I will continue next time.
Automated Backtester Research Plan (Part 2)
Posted by Mark on November 22, 2018 at 07:22 | Last modified: November 9, 2018 13:32

Last time I discussed backtesting naked puts by opening one trade every day.
A final piece to managing winners is total number of trades. In a serial scenario, total trades would be greater for managing winners than holding to expiration whereas in a daily/overlapping trade scenario, total trades would be equal despite average daily notional risk being less for managing winners. It might make sense to track daily notional risk as a proxy for actual buying power reduction, which would be significantly less in a [portfolio] margin account and perhaps too complex (or not worth the effort) to build into the automated backtester.
The research plan continues with backtesting naked puts in a serial manner by having only one trade open at a time.
For the serial approach, I would like to tabulate several different statistics. These include total number of trades, winning percentage, compound annualized growth rate (CAGR), maximum drawdown, risk-adjusted return (RAR), and profit factor (PF). Equity curves will represent just one potential sequence of trades and some consideration could be given to Monte Carlo simulation. We can plot equity curves for different account allocations such as 10% to 70% of initial account value by increments of 5% or 10% for a $50M account. A 30% allocation (for example) would then be $15M per trade. Trade size should be held constant throughout in order to maintain apples-to-apples comparison of drawdowns throughout the backtesting interval.
The general principle behind filters is to achieve more profit (PnL per trade—sometimes as a result of decreasing drawdown or, in this case, a higher winning percentage) despite fewer trades. My preference is not to see a lumpy equity curve where a vast majority of trades occur on a small percentage of days. This gets away from trading as a business to pay the monthly living expenses. When studying filters, it will therefore be important to look at number of trades and the slope of the equity curves under different filters to determine consistency of profit. RAR and PF will also be useful.
Examples of filters to be tested are numerous. We can look at trades taken at 5-20-day highs (lows) by increments of five. Trades can be taken only when a short-term MA is above (below) a longer-term MA.* Trades can be avoided when the underlying is under the 20-, 50-, or 200-day MA. IV at an X-day high may be a useful inclusion or exclusion filter (always minding sufficient sample size at extreme parameter values). Trade entry can be filtered by IV rank (perhaps 25% or 50% with a period of 30, 180, or 365 days). A volatility stop could be implemented to exit losing trades if IV increases by 30-100% using increments of 10%.
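One common IV rank formula (the text leaves the exact definition open) scales current IV between the lookback period's low and high; the history values below are illustrative:

```python
def iv_rank(iv_history, current_iv):
    """IV rank: where current IV sits between the period's low and high,
    scaled 0-100. Assumes iv_history spans the chosen lookback period."""
    lo, hi = min(iv_history), max(iv_history)
    return 100 * (current_iv - lo) / (hi - lo)

# Illustrative IV values over a lookback window:
history = [12.0, 14.5, 20.0, 16.0, 13.0]
print(iv_rank(history, current_iv=16.0))  # 50.0
```

A filter like "IVR above 50" would then admit a trade only when this value exceeds 50 at entry.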
I will continue next time.
* Some thought would have to be given to period determination. I do not want to get into an extensive optimization game since I'm more a believer in Occam's Razor (i.e. K.I.S.S.).
Automated Backtester Research Plan (Part 1)
Posted by Mark on November 19, 2018 at 05:46 | Last modified: October 17, 2018 07:54

Today I begin outlining a research plan for the automated backtester.
I want to start with naked puts because they employ the least leverage.
We can study trades entered every day between 30-64 DTE. We can choose the first strike under -0.10 to -0.50 delta by increments of -0.10. We can hold to expiration, manage winners at 25% (ATM options only?) or 50%, or exit at 7-21 DTE by increments of seven. We can also exit at 20-80% of the original DTE by increments of 15%. We can manage losers at 2x, 3x, 4x, and 5x initial credit. I’d like to track and plot maximum adverse (favorable) excursion (no management) for the winners (losers) along with final PnL and total number of trades. I want to monitor winning percentage, average win, average loss, largest loss, profit factor, average trade (average PnL), PnL per day, standard deviation of winning trades, standard deviation of losing trades, average days in trade (DIT), average DIT for winning trades, and average DIT for losing trades.
Return on investment (ROI) does not seem relevant for naked puts because of the large notional risk. At the moment, I cannot think of a need to track buying power reduction, but this is something I will keep in mind.
Speaking of notional risk, unless normalized, the average win/loss can vary significantly based on underlying price (and option prices). We can apply a fixed position size (e.g. $5M) and calculate the number of contracts for each trade. If I am selling a 1500 put, for example, then $5M divided by $150,000 (notional risk per contract) is 33 contracts and change; truncate to 33 contracts ($4,950,000). If I sell a 1000 put, then 50 contracts would amount to $5M notional risk. Regardless of underlying price, this gives a variable number of contracts that keeps notional risk relatively constant, thereby keeping profits and losses commensurate.
If we don’t normalize for notional risk then we would get numbers that don’t make as much sense. With the underlying at 1000 vs. 2000, for example, the contribution to the total PnL would be roughly twice as large at the higher prices. The overall contribution should not significantly vary based on an arbitrary factor.
I want to briefly discuss the relative constancy around target position size. I mentioned that $4,950,000 is 1% less than $5,000,000. As discussed here, I would expect this error to be inversely proportional to number of contracts because the percentage difference between consecutively decreasing integers increases (e.g. 19 is 5% lower than 20 whereas 9 is 10% lower than 10). If we deem this error to be too large—especially for lower-priced underlyings like RUT—then the target position size can be increased (e.g. from $5M to $10M).
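The truncation arithmetic above can be sketched directly; target size and strikes are the examples from the text:

```python
import math

def contracts_for_notional(target: float, strike: float,
                           multiplier: int = 100) -> int:
    """Contracts whose notional risk (strike * multiplier) fits under the
    target position size, truncated as described above."""
    return math.floor(target / (strike * multiplier))

n = contracts_for_notional(5_000_000, 1500)
print(n, n * 1500 * 100)  # 33 4950000, i.e. 1% under the $5M target
print(contracts_for_notional(5_000_000, 1000))  # 50, exactly $5M
```

Doubling the target (e.g. to $10M) shrinks the worst-case truncation error roughly in half, per the inverse proportionality noted above.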
I would like to see a distribution of losers in time and in magnitude. Date can be on the x-axis with underlying closing price (line graph) on the right y-axis and trade PnL (histogram) on the left y-axis. It certainly makes sense to do these graphs for expiration. We can also do the graphs for managing winners at 50%. I think it also makes sense to do these graphs for managing early (e.g. 7-21 DTE or X% of the original DTE) as well as managing losers.
I will continue next time.