Option Fanatic
Options, stock, futures, and system trading, backtesting, money management, and much more!

Time Spread Backtesting 2022 Q1 (Part 1)

Ironically, while developing a Python backtester over the last few months (e.g. here, here, and here), I have completely gotten away from time spread backtesting. Today, I will revisit the manual backtesting realm by looking at time spreads in the first three months of 2022.

As seen in previous posts on the subject (e.g. here, here, and here), time spreads may be approached in a variety of ways. In the current mini-series, I will address a number of different details and tweaks. Rather than get confused, distracted, and drawn off course by manually backtesting one at a time, my ultimate hope for the Python backtester is to be able to algorithmically run through a large sample size of each variant and compare pros versus cons.

For now, my base strategy is as follows:

With SPX at 4799, the first trade begins on 1/4/22 at the 4800 strike for $6,688: TD 20, IV 10.6%, horizontal skew -1.1%, NPV 291, and theta 15.6.

The very next trading day, a 2.57 SD move down brings us to the first adjustment point with PnL -7%:

time spread backtesting 2022 Q1 image 1 (6-20-22)

A 2.10 SD move down eight days later brings us to the second adjustment point with PnL -15%:

time spread backtesting 2022 Q1 image 2 (6-20-22)

Max loss is hit five days later on a 2.44 SD move lower:

time spread backtesting 2022 Q1 image 3 (6-20-22)

SPX cratering 2.31 SD in 14 days has resulted in a loss of 20.4%. Tough to overcome that! Although horizontal skew increases, it still remains negative while IV spikes ~90%. This suggests IV increase as a partial hedge in this trade.

I will continue next time.

Resolving Dates on the X-Axis (Part 2)

Today I conclude with my solution for resolving dates as x-axis tick labels.

I think part of the confusion is that up to this point, the x-coordinates of the plotted points have been identical to the x-axis tick labels. This need not be the case, though, and is not even desirable. I want to leave the tick labels as datetime so matplotlib can scale them automatically. This should also allow matplotlib to plot the x-values in the proper place.

Documentation on plt.xticks() reads:

plt dot xticks documentation (6-14-22)

The first segment suggests I can define the tick locations and tick labels with the first two arguments. For now, those are identical. Adding c as the first two arguments in L11 (see Part 1) gives this:

output of code snippet 12 (6-14-22)

Ah ha! Can I now insert a subset as a different time range for the x-coordinates?

output of code snippet 13 (6-14-22)

I think we’re onto something! I commented out the print lines in the interest of space.

Finally, let’s reformat the x-axis labels to something more readable and verify datatype:

Code snippet 14 (6-14-22)

Success! I am able to eliminate hours, minutes, and seconds. Interestingly, the axis labels now show up as strings but matplotlib is still able to understand their values and plot the points correctly (I suspect the latter takes place before the former). Changing the date range on the axis helps because this graph should look different from the previous one.
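Since the snippets above are images, here is a minimal sketch of the same idea with made-up values: the plotted x-coordinates come from one date range, the tick locations come from a second (wider) date range, and the tick labels are strings built with strftime(). The variable names, dates, and y-values are my own placeholders, not the ones from the original snippet.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: five weekly points (not the values from the original post)
x_points = list(pd.date_range("2022-01-07", "2022-02-04", periods=5))
y_points = [265, -10, 130, 330, 90]

# Tick locations span a wider range than the data and are chosen independently
tick_locs = list(pd.date_range("2022-01-01", "2022-02-12", periods=7))
tick_labels = [t.strftime("%Y-%m-%d") for t in tick_locs]  # strings without h:m:s

plt.plot(x_points, y_points)
plt.xticks(tick_locs, tick_labels, rotation=45)  # locations first, labels second
ax = plt.gca()
print(tick_labels)
```

The tick locations stay as datetimes, so matplotlib can still place the points correctly. Only the label text is a string.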

To put in more object-oriented language:

Code snippet 15 (6-17-22)

I suspect the confusion between the plt and fig, ax approaches is widespread. For a better explanation, see here or here.
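For comparison, a rough sketch of how the fig, ax style might look with the same made-up data. Here I let a DateFormatter build the label text instead of passing strings myself. That choice is my assumption and not necessarily what Code snippet 15 does.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data (placeholders, not the values from the original post)
x_points = list(pd.date_range("2022-01-07", "2022-02-04", periods=5))
y_points = [265, -10, 130, 330, 90]
tick_locs = list(pd.date_range("2022-01-01", "2022-02-12", periods=7))

fig, ax = plt.subplots()
ax.plot(x_points, y_points)
ax.set_xticks(tick_locs)                                        # locations stay datetimes
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))  # label text only
ax.tick_params(axis="x", labelrotation=45)
```

Every call is made on a named object (ax), so there is no question about which axes is being changed.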

Resolving Dates on the X-Axis (Part 1)

Having previously discussed how to use np.linspace() to get evenly-spaced x-axis labels, my final challenge for this episode of “better understanding matplotlib’s plotting capability” is to do something similar with datetimes.

This will be a generalization of what I discussed in the last post. As mentioned in the fourth paragraph there, articulating exactly what I am trying to achieve is of the utmost importance.

I begin with the following code and a new method pd.date_range():

Code snippet 11 (6-14-22)

L5 generates a datetime index that I can convert to a list using the list() constructor (see output just above graph). Each element of the subsequent list is datatype pd.Timestamp, which is the pandas replacement for the Python datetime.datetime object. Observe that the first and second arguments are start date and end date, which are included in the Timestamp sequence. Also notice that the list has five elements, which is consistent with the third argument of pd.date_range().

Given a start date, end date, and n labels, this suggests I can generate (n – 1) evenly-spaced time intervals. Great start!
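A small sketch of that observation, assuming start and end dates of my own choosing (the original snippet is an image): n labels bound (n - 1) equal intervals, and both endpoints are included.

```python
import pandas as pd

# Hypothetical start date, end date, and label count
start, end, n_labels = "2022-01-03", "2022-01-31", 5

idx = pd.date_range(start, end, periods=n_labels)  # DatetimeIndex
stamps = list(idx)                                 # list of pd.Timestamp objects

# Differences between neighboring timestamps: (n_labels - 1) intervals
intervals = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
print(stamps)
print(intervals)
```

With 28 days between the endpoints and five labels, each of the four intervals works out to seven days.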

The enthusiasm fades when looking down at the graph, however. First, I get nine instead of five tick labels. Second, my desired format is yyyy-mm-dd as contained in L5. I do not know how/where the program makes either determination.

Another problem is that if I change the third argument (L5) to 15 to get more tick labels, a ValueError results: “x and y must have same first dimension, but have shapes (15,) and (5,).” That makes sense because I now have an unequal number of x- and y-coordinates. This date_range is really intended to be used only for tick labels and not as the source of x-coordinates. I may need to create a separate date_range (or make another list of x-coordinates) for plt.plot() and then create something customizable for evenly-spaced datetime tick labels.
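The following sketch reproduces the error and then separates the two roles. The dates and y-values are made up, and using ax.set_xticks() for the tick locations is one option among several.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

y_points = [265, -10, 130, 330, 90]  # five made-up y-values
x_points = list(pd.date_range("2022-01-03", "2022-01-31", periods=5))
tick_locs = list(pd.date_range("2022-01-03", "2022-01-31", periods=15))

fig, ax = plt.subplots()
error_message = None
try:
    ax.plot(tick_locs, y_points)  # 15 x-values against 5 y-values
except ValueError as err:
    error_message = str(err)
    print(error_message)

ax.plot(x_points, y_points)  # 5 x-values against 5 y-values works
ax.set_xticks(tick_locs)     # 15 tick locations, independent of the data
```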

I will continue next time.

Resolving the X-Axis (Part 2)

I left off last time with a promising solution for setting x-axis labels using the matplotlib.ticker.FixedLocator class. Unfortunately, the example at the bottom shows this doesn’t work for all values, which calls the solution into question.

What’s going on? Take a look at the following code snippet:

Zip code snippet 10 (6-9-22)

This shows for equally-spaced tick labels having integer coordinates, only certain numbers of labels are possible: 2, 3, 4, 5, 7, 10, and 20. I did not get six because it’s not mathematically possible. The same holds true for 8-9 and 11-19. When multiple equally-spaced lists are possible, I was really aiming for the one with the last element closest to the final date in the list.
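To make the arithmetic concrete, here is a sketch that applies the modulo rule from Part 1 to every requested label count. I am assuming 20 plotted points (indices 0 through 19), and tick_positions is a helper name of my own.

```python
a = list(range(20))  # assumption: 20 plotted points, indexed 0 through 19

def tick_positions(n_items, n_requested):
    """Indices kept by the modulo rule for a requested number of labels."""
    step = (n_items - 1) // (n_requested - 1)  # integer spacing between labels
    return [x for x in range(n_items) if x % step == 0]

# Map each requested count to the number of labels actually produced
produced = {n: len(tick_positions(len(a), n)) for n in range(2, 21)}
print(produced)
print(sorted(set(produced.values())))
```

Because the step is an integer, several requested counts collapse onto the same step: requesting 6 or 7 both give a step of 3 and therefore seven labels.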

In order to code this stuff accurately, I need to articulate exactly what I’m trying to achieve. I failed to do that.

Aside from the FixedLocator Class, another way to approach this is with np.linspace(a, b, c). This automatically creates a linear space of c-point subdivisions between a and b inclusive (i.e. a and b always included as the first and last values):

numpy linspace example (6-9-22)

Note how each list begins and ends with 0 (a) and 19 (b), respectively.
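A short sketch of np.linspace() with the same endpoints (0 and 19). The label counts are arbitrary choices of mine. Note that the values come back as floats, so I round them here in case they need to serve as list indices.

```python
import numpy as np

first, last = 0, 19  # index of the first and last plotted point

for n_labels in (2, 5, 6):
    locs = np.linspace(first, last, n_labels)  # endpoints always included
    print(n_labels, locs)

# linspace returns floats; round if the locations must be integer indices
index_locs = np.round(np.linspace(first, last, 6)).astype(int)
print(index_locs)
```

Unlike the modulo approach, asking for six locations gives exactly six. The tradeoff is that after rounding, the spacing is only approximately even.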

How do the plots look with different numbers of x-axis labels?

Graphing subplots with different number of x-axis labels by loop (6-9-22)

In the interest of space, I will describe rather than show the output. We get 20 subplots where the number of tick labels increases from zero to 19 by an increment of one for each subplot. The graphs are identical—the only thing that changes is the number of equally-spaced tick labels. Outstanding!

Some highlights of this code are as follows:

I’m quite happy with the progress made here!

Resolving the X-Axis (Part 1)

As it turns out (see here and here), some of the matplotlib debugging came down to better understanding the zip() function. I still have some further considerations to resolve.

I would like to enlarge the graph so the axis isn’t so crowded when every label is included.

First though, I want the x-axis tick labels and locations to be handled automatically. I want z labels spaced evenly throughout the time interval from first Friday to last Friday. Alternatively, I may want to try plotting labels only where new trades begin.

When left to plot the x-axis tick labels automatically, others were seeing consistent tick labels on the 1st and 15th of each month as discussed in the third paragraph of Part 7. That would be acceptable, but for some unknown reason, I got asymmetric labels on the 1st and 22nd of each month as shown near the bottom here.

I stumbled upon the matplotlib.ticker.FixedLocator class, which is seen in L10 below:

Zip code snippet 9 (6-6-22)

The highlighted number is the number of tick labels that I expect to see. I determined this by trial and error (it requires the minus one). I want constant spacing across these labels and eventually, I’d like the program to calculate the optimal number.

Let’s break this down to see how it works (or not):

     > [x for x in range(len(a)) if x%((len(a)-1)//(5-1))== 0]

This is a pretty complicated piece of code for a beginner (me). First, we have to recognize it as a list comprehension: it will generate a list. A list will direct the program to place tick labels at specified locations as shown just above the first graph here.
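As a rough sketch of how such a list can drive the tick locations, assuming 20 points and made-up y-values (the original snippet is an image, so the surrounding code here is my own):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FixedLocator

a = list(range(20))     # assumption: 20 points, indices 0 through 19
y = [i * i for i in a]  # made-up y-values

# The list comprehension from above, requesting 5 labels
positions = [x for x in range(len(a)) if x % ((len(a) - 1) // (5 - 1)) == 0]

fig, ax = plt.subplots()
ax.plot(a, y)
ax.xaxis.set_major_locator(FixedLocator(positions))  # ticks only at these x-values
print(positions)
```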

The list will be generated as follows:

If I populate the highlighted number as 1, then I’ll get division by zero (not good). I’d never want just one tick label anyway. Two works along with 3, 4, and 5.
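Here is a sketch of those cases, again assuming len(a) is 20. The results dictionary is only there to collect what each requested count produces.

```python
n_items = 20  # assumption: len(a) == 20

results = {}
for n_labels in (1, 2, 3, 4, 5):
    try:
        step = (n_items - 1) // (n_labels - 1)  # spacing between kept indices
    except ZeroDivisionError:
        results[n_labels] = "ZeroDivisionError"  # n_labels of 1 divides by zero
        continue
    # Keep every index that is a multiple of the step
    results[n_labels] = [x for x in range(n_items) if x % step == 0]

for n_labels, kept in results.items():
    print(n_labels, kept)
```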

What about 6?

Problem defining 6 tick labels (6-6-22)

I count seven tick labels.

Houston, we have a problem.

Understanding the Python Zip() Method (Part 2)

Zip() returns an iterator. Last time, I discussed how elements may be unpacked by looping over the iterator. Today I will discuss element unpacking through assignment.

As shown in case [40] below, without the for loop each tuple may be assigned to a variable:

Zip code snippet 6 (5-31-22)

[37] shows that when the result is assigned to one variable, that variable simply holds the zip object. Trying to assign to two or three variables does not work because zip(a, b, c) contains four tuples. As just mentioned, [40] works and if I print yp, m, and n, the other three tuples can be seen:

Zip code snippet 7 (5-31-22)

I got one response that reads:

     > But since you hand your zip iterables that all have 4 elements, your
     > zip iterator will also have 4 elements.

Regardless of the number of variables on the left, on the right I am handing zip three iterables with four elements each.

     > This means if you try to assign it to (xp, yp, m), it will complain
     > that 4 elements can’t fit into 3 variables.

This holds true for three and two variables as shown in [39] and [38], respectively, but not for one variable ([37]). Why?

Maybe it would help to press forward with [37]:

Zip code snippet 8 (6-3-22)

If assigned to one variable, the zip() object still needs to be unpacked (which may also be accomplished with a for loop). If assigned to four variables, each variable receives one 3-element tuple at once.
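A sketch of both cases, using the values shown in the series further down. With a single name on the left, nothing is unpacked at all: the name is just bound to the zip object, which is why that case never complains about the number of elements.

```python
a = ['1-6-2017', '1-13-2017', '1-20-2017', '1-27-2017']
b = [265, -10, 130, 330]
c = ['d', '', 'd', '']

z = zip(a, b, c)         # one name: bound to the zip object, no unpacking happens
print(type(z).__name__)
rows = list(z)           # consuming the iterator yields the four tuples
print(rows)

xp, yp, m, n = zip(a, b, c)  # four names: each receives one 3-element tuple
print(xp)
print(n)
```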

In figuring this out, I was missing the intermediate step in the [un]packing. zip(a, b, c) produces this series:

     ('1-6-2017', 265, 'd'), ('1-13-2017', -10, ''), ('1-20-2017', 130, 'd'), ('1-27-2017', 330, '')
     or
     (a0, b0, c0), (a1, b1, c1), (a2, b2, c2), (a3, b3, c3)

xp, yp, m = zip(a, b, c) tries to unpack that series of four tuples into three variables. This does not fit and a ValueError results.

for xp, yp, m in zip(a, b, c) unpacks one tuple (ax, bx, cx) at a time into xp, yp and m.

Despite my confusion (I’m not alone as a Python beginner), zip() is always working the same. The difference is what gets unpacked: an entire sequence or one iteration of a sequence. zip(a, b, c) always generates a sequence of tuples (ax, bx, cx).

When unpacking in a for loop, one iteration of the sequence—a tuple—gets unpacked:

     xp, yp, m = (ax, bx, cx)

When unpacking outside a for loop, the entire sequence gets unpacked:

     xp, yp, m, n = ((a0, b0, c0), (a1, b1, c1), (a2, b2, c2), (a3, b3, c3))
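The two situations side by side, as a minimal sketch with the same four-element lists:

```python
a = ['1-6-2017', '1-13-2017', '1-20-2017', '1-27-2017']
b = [265, -10, 130, 330]
c = ['d', '', 'd', '']

# Outside a loop, the whole sequence of four tuples is unpacked at once
assignment_error = None
try:
    xp, yp, m = zip(a, b, c)  # four tuples into three names
except ValueError as err:
    assignment_error = str(err)
    print(assignment_error)

# Inside a loop, one 3-element tuple is unpacked per iteration
unpacked = []
for xp, yp, m in zip(a, b, c):
    unpacked.append((xp, yp, m))
print(unpacked)
```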

Understanding the Python Zip() Method (Part 1)

As promised at the end of my last post, I’ve done some digging with some extremely helpful people at Python.org. Today I will work to wrap up loose ends mainly by discussing the Python zip() method.

My first burning question (Part 8) asks why L42 plots a line whereas L45 plots a point. The best answer I received says that matplotlib draws lines between points. If you give it X points then it will draw (X – 1) lines connecting those points. I was pretty much correct in realizing L45 receives one point at a time and therefore draws (1 – 1) = 0 lines.
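A sketch of that difference with stand-in data. I use week numbers for x and pass the marker code by keyword, both of which are my own simplifications and not the original L42 and L45.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]  # stand-in x-values (made-up week numbers)
y = [265, -10, 130, 330]
markers = ['d', '', 'd', '']  # 'd' is a thin diamond, '' draws no marker

fig, ax = plt.subplots()
ax.plot(x, y)  # 4 points in one call: 3 connecting segments

for xp, yp, m in zip(x, y, markers):
    # One point per call: nothing to connect, so only the marker (if any) shows
    ax.plot(xp, yp, marker=m, color='g', markersize=8)

print(len(ax.lines))
```

The first call creates one line object holding all four points. Each call inside the loop creates a separate line object holding a single point.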

To understand how L45 gets points, I need to better comprehend the zip() method. Zip() returns an iterator. Elements may then be unpacked via looping or through assignment.

Let’s look at the following examples to study the looping approach.

Unpacking to one variable (xp) outputs a tuple with each loop:

Zip code snippet 1 (5-31-22)

Unpacking to two variables (xp, yp) does not work:

Zip code snippet 2 (5-31-22)

“Too many values to unpack” is confusing to me. If there are too many values to unpack for two variables, then why are there not too many to unpack for one? Perhaps the first example should be conceptualized as one sequence with four tuples. If so, then can’t this be conceptualized as one sequence with two tuples unpacked through two loops each?
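One way to look at it, sketched with the same kind of lists: every pass through the loop produces a single 3-element tuple. One name takes that tuple whole, so no unpacking is attempted. Two or more names ask Python to split the tuple, and then the count of names has to match the count of elements in the tuple.

```python
a = ['1-6-2017', '1-13-2017', '1-20-2017', '1-27-2017']
b = [265, -10, 130, 330]
c = ['d', '', 'd', '']

whole = []
for xp in zip(a, b, c):
    whole.append(xp)  # one name receives the entire 3-element tuple
print(whole)

loop_error = None
try:
    for xp, yp in zip(a, b, c):  # two names for a 3-element tuple
        pass
except ValueError as err:
    loop_error = str(err)
    print(loop_error)
```

The "too many values" are the three elements of the first tuple, which do not fit into two names.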

Looping over the iterator with three variables yields this:

Zip code snippet 3 (5-31-22)

To better illustrate how the value from a gets assigned to xp, the value from b gets assigned to yp, and the value from c gets assigned to m, here is the same example with all variables printed:

Zip code snippet 4 (5-31-22)

Unlike the top example, these are not tuples as no parentheses appear. Each line is just three values with spaces in between.

Looping over the iterator with four variables does not work:

Zip code snippet 5 (5-31-22)

I understand why four were expected (xp, yp, m, n) and as shown in the previous example, only three lists are available to be unpacked up to a maximum of four times.
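The three-name and four-name loops can be sketched as follows. The number of names has to match the number of lists handed to zip() (three here), not the length of those lists (four).

```python
a = ['1-6-2017', '1-13-2017', '1-20-2017', '1-27-2017']
b = [265, -10, 130, 330]
c = ['d', '', 'd', '']

lines = []
for xp, yp, m in zip(a, b, c):
    print(xp, yp, m)  # three separate values, not a tuple
    lines.append((xp, yp, m))

loop_error = None
try:
    for xp, yp, m, n in zip(a, b, c):  # four names for a 3-element tuple
        pass
except ValueError as err:
    loop_error = str(err)
    print(loop_error)
```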

Next time, I will continue with examples of element unpacking through assignment.

Debugging Matplotlib (Part 8)

Getting back to the objectives laid out here, I completed #1 in Part 4, #2-3 in Part 5, and #5 in Part 6. I will resume with objective #4: randomly select five Fridays as trade entries.

This line is pretty straightforward:

Code Snippet 7 (5-26-22)
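The snippet itself is an image, so the following is only one way such a selection might look. I assume the candidate dates are all Fridays in 2017 and that random.sample() does the picking. The names fri_2017 and entries, and the fixed seed, are mine.

```python
import random
import pandas as pd

# All Fridays in 2017 (assumed candidate dates)
fri_2017 = list(pd.date_range("2017-01-01", "2017-12-31", freq="W-FRI"))

random.seed(0)  # only to make this sketch repeatable
entries = sorted(random.sample(fri_2017, 5))  # five distinct Fridays, in date order
print([d.strftime("%Y-%m-%d") for d in entries])
```

random.sample() draws without replacement, so the same Friday cannot be selected twice.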

Finally, this snippet allows me to conquer objective #6:

Code Snippet 8 (5-26-22)

This is actually somewhat complex code for a beginner like me. I will go over a few points.

First, note that I have simplified the graph from two subplots to just one. The reason for including two subplots earlier was only to compare tick labels on the x-axis.

Second, look at the syntax of L45. The arguments are x-values, y-values, marker code, color, and markersize. L42 is an abbreviated version with just the first two arguments. L45 plots the markers while L42 plots the line. How does this work?

In L42, the arguments are datatype list.

In L45, the datatype is more complicated. The first three arguments of L45 are generated in L44 from a zip function. From W3Schools.com:

     > The zip() function returns a zip object, which is an iterator of tuples where
     > the first item in each passed iterator is paired together, and then the second
     > item in each passed iterator are paired together etc.

The zip function itself produces a zip object. Trying to directly unpack the object into variables does not work:

Code Snippet 9 (5-26-22)

I’m still trying to understand what the “too many values” are. I would expect to get a list of (xp, yp, m) tuples from this.

As it turns out, I can get such a list with the list constructor:

Code Snippet 10 (5-26-22)

Like the list constructor, the for loop iterates over the zip object until nothing is left. Each time, it unpacks three values from the zip object: one from each list. These then get presented to L45 as the x-value, y-value, and marker code. This plots a set of points showing up as diamond markers or blank instead of a continuous line, perhaps because three separate values are presented each time rather than two lists being presented at once? It’s hard for me to articulate this, which suggests that I don’t fully understand it yet.

Next time, I will do a bit more digging in order to explain this better.

In the meantime, mission accomplished for all six objectives!

Debugging Matplotlib (Part 7)

I will pick up today by discussing why the x-axis labels are different for the lower subplots presented in Part 6.

To clarify some terminology, I have been saying “x-axis labels,” which I think is adequately descriptive and perhaps even correct. In different online forums, I have seen mentions of “tick labels” and “tick locations.” The 1st and 22nd of each month are tick locations on a date axis. The tick labels are what get printed at those locations. For dates on a date axis, tick locations and tick labels are identical.

The best answer I received to the original question says that matplotlib (MPL) is probably doing with dates what it does with numbers: calculating evenly-sized intervals to fit the plot (based on first and last values). The respondent reports tick locations at the 1st and 15th of each month, though, which makes more sense as “evenly-sized.” The 21 days followed by 7-10 days I get at the 1st and 22nd of each month are lopsided. Although I still lack explanation for the latter, I did find this SO post showing the same thing (no explanation given there, either).

With regard to this line:

     > converted_Fri_2017 = [d.strftime('%Y-%m-%d') for d in Fri_2017] #list comprehension

Values lose meaning when converted to strings. MPL spaces strings evenly without regard to any numeric or date value.

String conversion works in this instance because tick locations = tick labels, but other cases could present problems. One such case would be non-fixed-interval trade entry dates. Another example would be a longer time horizon where too many tick labels may render the x-axis illegible. If left as dates (or datetimes: both worked the same for me) then MPL could potentially scale accordingly (see first sentence of paragraph #3, above), but converting to strings robs MPL of this opportunity.
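To illustrate the non-fixed-interval case, here is a sketch with three made-up entry dates separated by 7 and then 21 days. Plotted as dates, the x-positions keep those gaps. Plotted as strings, they are treated as categories and spaced one unit apart.

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up, irregularly spaced entry dates: gaps of 7 and 21 days
dates = [dt.date(2017, 1, 6), dt.date(2017, 1, 13), dt.date(2017, 2, 3)]
labels = [d.strftime('%Y-%m-%d') for d in dates]
y = [265, -10, 130]

fig, (ax_dates, ax_strings) = plt.subplots(2, 1)
line_dates, = ax_dates.plot(dates, y)       # positioned by date value
line_strings, = ax_strings.plot(labels, y)  # positioned as categories

# Numeric x-positions that matplotlib actually uses for each line
xd = [float(v) for v in line_dates.get_xdata(orig=False)]
xs = [float(v) for v in line_strings.get_xdata(orig=False)]
gaps_dates = [later - earlier for earlier, later in zip(xd, xd[1:])]
gaps_strings = [later - earlier for earlier, later in zip(xs, xs[1:])]
print(gaps_dates)
print(gaps_strings)
```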

Much functionality remains with regard to ax.xaxis.set_ticks(), ax.set_xlim(), ax.set_xticks(), ax.set_xticklabels(), ax.tick_params(), plt.setp(), AutoDateLocator, ax.xaxis.set_major_locator(MultipleLocator()) from Part 3, etc. The list goes on, and solutions are varied based on version. That is to say they may have worked when posted, but if subsequent versions have been released (especially with previous functionality deprecated), those solutions may no longer be suitable.

I do not plan to write an encyclopedia of all the available functionality. I will resort to picking and choosing based on any particular needs I have at a given time.

Debugging Matplotlib (Part 6)

I left off with a seemingly counterintuitive situation where plt.xticks() either affects something yet to be generated or gets undone by something later in the program. After completing that last post, though, I had a shocking realization: I THINK I KNOW THE ANSWER AS A RESULT OF MATPLOTLIB EXCEPTIONS HAVING BEEN RAISED IN MY PAST WORK!

Exceptions are usually frustrating because they force me to problem solve something I inadvertently did wrong. Now, that past frustration proves quite beneficial in leaving the indelible image in my mind of a completely blank graph.

Let me simplify the code to include only the imported modules and the first graphing line:

Code Snippet 5 (5-19-22)

I completely erred in my reasoning throughout the last four paragraphs of Part 5. Neither L3 nor L6 draws any axes. All axes are generated in L1 and this includes the “last [second set of] axes.” L4 and L7 both operate on the second set of axes defined in L1, which is why only the x-axis labels of the lower graph were rotated.

This makes more sense. There is no retroactive operation and no need to hold a command in memory for something not yet generated—both of which seem very “unpythonic.”

Having said all that, experiencing a natural high, and catching my breath, this snippet produces the desired outcome:

Code Snippet 6 (5-19-22)

Technically, the current axes are by default the ones created last. Current axes may also be set explicitly as shown here. This is how to vary the target of plt.xticks() to get x-axis labels rotated on both graphs.
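A minimal sketch of this behavior with placeholder data. I use plt.sca() to set the current axes, which may or may not be the call used in the snippet above, and the targets list exists only to record which axes plt.xticks() acted on.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, (ax_top, ax_bottom) = plt.subplots(2, 1)  # both axes exist after this line
current_by_default = plt.gca()                 # the axes created last

targets = []
for ax in (ax_top, ax_bottom):
    ax.plot([1, 2, 3], [265, -10, 130])  # placeholder data
    plt.sca(ax)                          # make this axes the current one
    plt.xticks(rotation=45)              # acts on whichever axes is current
    targets.append(plt.gca())
```

Without the plt.sca() call, both plt.xticks() calls would act on the lower axes.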

Now…

Why is the spacing of x-axis labels different on these two graphs?

I will address that next time.