Asymmetric Wager: systematic strategy


Saturday, October 21, 2017

Systematic Trading | Using Autoencoder for Momentum Trading

In a previous post, we discussed the basic nature of various technical indicators and noted some observations. One of the ideas was that, at a basic level, most indicators capture the concept of momentum versus mean-reversion. Most do so in the price returns space, but some work in a non-linear transformation of the returns space, like signed returns or time since a new high/low. We presented the idea of a PCA approach to extract the momentum signals embedded in these indicators. To get from there to a trading model, the next step is to collate this momentum signal (the first PCA component, or higher ones if required) along with other input variables (like returns volatility and/or other fundamental indicators) and train a separate regression/classification model (like a random forest or a deep NN).
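As a rough illustration of the PCA step, the sketch below builds three toy momentum-style indicators from synthetic returns and takes the first principal component as a combined momentum factor. The indicators, window lengths and data are all assumptions made for the example; they are not the indicator set used in the post.

```r
set.seed(1)
n <- 300
ret <- rnorm(n, sd = 0.01)                 # synthetic daily returns (assumption)
price <- 100 * cumprod(1 + ret)

# trailing simple moving average; the first k - 1 values are NA
roll_mean <- function(x, k) as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))

# three toy indicators (hypothetical stand-ins for the real indicator set)
ind <- cbind(
  roc10   = c(rep(NA, 10), price[-(1:10)] / price[1:(n - 10)] - 1),  # 10-day rate of change
  ma_gap  = price / roll_mean(price, 20) - 1,                        # distance from 20-day average
  up_frac = roll_mean(as.numeric(ret > 0), 20)                       # share of up days in 20 days
)
ind <- ind[complete.cases(ind), ]

# standardize the inputs and extract the principal components
pca <- prcomp(ind, center = TRUE, scale. = TRUE)
momentum_factor <- pca$x[, 1]              # first principal component
var_share <- summary(pca)$importance[2, ]  # proportion of variance per component
```

The vector momentum_factor would then be one of the inputs to the downstream regression or classification model.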

One of the issues with using simple PCA is that it is linear and hence may not be appropriate for summarizing the different measures captured across all these indicators. Here we discuss the next logical improvement: a non-linear dimensionality reduction approach using an autoencoder.

As discussed here, the new Keras R interface has made it very easy to develop deep learning models in R using the TensorFlow framework. Here we use this interface to train an autoencoder on the same set of technical indicators on the NSE Nifty 50 Index as before. The steps involved are relatively straightforward. First we generate and standardize the inputs (technical indicator levels). Then we build the computation graph.

To do so, we first define the encoding layers (2 hidden layers; the latent coded unit size is 3, to match the first 3 components of the PCA we use for comparison) and two different decoding layers. The two decoding layers let us train the autoencoder end to end and also run the decoding step independently.

Next we combine these layers to create the computational graphs: one for the encoder only, another for the decoder, and a third for the end-to-end autoencoder, which is the one we actually train.

The rest is standard. We define a loss function that compares the reconstructed output with the input (mean squared error) and train the model. The training uses data up to 2013, and the test set runs from 2014 to the present. Once the training is done, we can use the encoder and decoder separately to generate a dimensionality reduction of the input space and vice versa.
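To make the encoder/decoder split concrete without depending on Keras, here is a minimal base R sketch of a tiny autoencoder trained by plain gradient descent on synthetic data. It is a stand-in, not the Keras model from the post: it uses a single tanh encoding layer instead of two hidden layers, a linear decoder, and made-up inputs, learning rate and epoch count.

```r
set.seed(42)
n <- 500; p <- 8; k <- 3        # observations, indicators, latent size (assumptions)

# synthetic standardized "indicators" driven by 3 hidden factors
f <- matrix(rnorm(n * k), n, k)
X <- scale(tanh(f %*% matrix(rnorm(k * p), k, p)) + 0.1 * matrix(rnorm(n * p), n, p))

# parameters: encoder (p -> k, tanh) and linear decoder (k -> p)
W1 <- matrix(rnorm(p * k, sd = 0.1), p, k); b1 <- rep(0, k)
W2 <- matrix(rnorm(k * p, sd = 0.1), k, p); b2 <- rep(0, p)

encode <- function(X) tanh(sweep(X %*% W1, 2, b1, "+"))
decode <- function(Z) sweep(Z %*% W2, 2, b2, "+")
mse    <- function(A, B) mean((A - B)^2)

loss_start <- mse(decode(encode(X)), X)
lr <- 0.05
for (epoch in 1:2000) {
  Z   <- encode(X)
  Xh  <- decode(Z)
  dXh <- 2 * (Xh - X) / length(X)         # gradient of the MSE w.r.t. the output
  gW2 <- t(Z) %*% dXh
  gb2 <- colSums(dXh)
  dZ  <- (dXh %*% t(W2)) * (1 - Z^2)      # back-propagate through tanh
  gW1 <- t(X) %*% dZ
  gb1 <- colSums(dZ)
  W1 <- W1 - lr * gW1; b1 <- b1 - lr * gb1
  W2 <- W2 - lr * gW2; b2 <- b2 - lr * gb2
}
loss_end <- mse(decode(encode(X)), X)

latent <- encode(X)             # 3-dimensional code for each observation
recon  <- decode(latent)        # reconstruction from the code alone
```

After training, encode() and decode() can be called separately, which mirrors the use of the encoder-only and decoder-only graphs described above.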

The output of the dimensionality reduction is compared with the PCA. As the correlations below show, the principal components map almost one-to-one onto the three latent dimensions of the hidden layer that generates the encoding (PC1 to V1, PC2 to V3 and PC3 to V2). So the encoded dimensions are close to orthogonal in our case, although this need not always be true.

	V1	V2	V3
PC1	1	-0.3	0.2
PC2	0.1	-0.2	0.8
PC3	-0.2	-0.9	0.5
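A correlation table of this shape can be produced with a single cor() call on the two sets of scores. The sketch below uses synthetic inputs, and the "latent" codes are an invented stand-in (a permuted, mildly non-linear transform of the principal components) rather than the output of a trained autoencoder.

```r
set.seed(7)
X <- scale(matrix(rnorm(200 * 6), 200, 6) %*% matrix(rnorm(36), 6, 6))
pcs <- prcomp(X)$x[, 1:3]                  # first three principal component scores

# stand-in latent codes (assumption): reorder the PCs and squash them with tanh
latent <- tanh(0.5 * pcs[, c(1, 3, 2)])
colnames(latent) <- paste0("V", 1:3)

cor_table <- round(cor(pcs, latent), 1)    # rows PC1..PC3, columns V1..V3
cor_table
```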

The scatter plot below captures the same, but also highlights some non-linearity, especially between the first PCA component and the first latent dimension from the autoencoder.

From here the next step is obvious: replace the PCA factor inputs in the momentum trading model described in the first paragraph with these latent dimensions from the autoencoder and re-evaluate. This will capture a richer set of inputs that can handle non-linearity and will hopefully perform better than linear PCA. Here are some results that others have reported (opens PDF). Here are some more (opens PDF) on using an autoencoder for cross-sectional momentum trading. The entire code is available here.

Saturday, May 6, 2017

Markets | VIX - Waiting For Godot

By now everyone and their cats are aware that volatility across markets and asset classes is low, has been so for a long time, and shows no signs of reversal. VIX, the US market benchmark vol index, is around its historic lows. The MOVE Index, the bond market benchmark from BofA/ML, is no better. CVIX, an FX benchmark from Deutsche, is doing a bit better but nothing reassuring. People have punted on, hoped for and feared a comeback of volatility, but so far we have not seen any sustained sign of it.

The reasons and the expectations from analysts come mainly in two flavours. The first narrative is that volatility is artificially suppressed by big league volatility sellers (speculators, but more importantly the ETF folks and the systematic risk factor people). The second narrative is that the market in general is going through a hopeful, optimistic patch supported by central bank puts. Both groups believe volatility is going to explode sooner or later. According to the first narrative, a potential driver is a random shock that will force re-balancing in ETFs and risk factor strategies and will amplify the move. The second version is that we are just a few bad economic prints or some geo-political mis-steps away from runaway volatility.

While both of these narratives have some merit, neither is sufficient or complete, or even useful for any practical purpose. There are different opinions, but I tend to side with the arguments from the risk factor people (like AQR) that this line of argument vastly over-estimates the impact of risk factor portfolios. And it is hardly fair to blame some folks for selling vol in a steep roll-down scenario like the one we have these days (we have written about it before). On top of that, there is certainly some influence from street positioning. As we have written before, for a long time now the dominant position of the big hedgers (read big banks and market making houses) has been long gamma, which has a stabilizing effect and pins the vol down. The second "complacency" narrative appears less plausible, but of course cannot be ruled out.

But irrespective of which one (or maybe even both) you believe in, neither is useful for taking a position in volatility. Essentially the argument is: volatility is trading in a distorted way and we need an external event to set it right. It is cheap since such an event will surely come some time in the future. Unfortunately, by definition, we cannot predict much about the timing of an unexpected external event. And presumably you do not have the luxury of an infinite stop-loss on the bleeding you will suffer while you wait for that vol-exploding event to materialize.

In fact the only predictable statement to make about the direction of volatility is: when the rates go up, VIX will follow. And here is why.

To start, note that although the VIX is near historical lows, it is not cheap: realized volatility has been lower. The second fundamental thing to note is that in the post-crisis world, volatility has transcended its status as just a "fear gauge" and has become an asset class in its own right. And in this world of unconventional monetary policy and low rates, volatility has become intrinsically tied to the level of rates. The chart below captures this point.

We talked about this point way back in 2012 (from the bond markets' point of view). When you treat volatility as an asset class (where selling volatility is a surrogate carry strategy), the connection becomes easy to see. Consider an asset allocator who has the option to either sell volatility and collect the premiums, or buy some equivalently risky carry product, e.g. a high yield corporate bond portfolio.

To make an apples-to-apples comparison, we can think of a hypothetical "volatility bond". Given the existing spread of risky (BBB) bonds to treasuries, we can deduce the probability of default of such an investment. From this, we can construct a volatility bond, which consists of selling an out-of-the-money (OTM) call spread and put spread on the S&P 500, each 100 points wide. The strikes of the short options are chosen such that the probability (implied from volatility) of them ending up in the money is equal to the probability of default of the high yield portfolio above (worth 100 in notional). In both cases the maximum we can lose is $100 (note that in the case of the short vol strategy, only one of the call or put spreads can be in the money and exercised against us). So the yield from the high yield portfolio and the premium collected (let's call that the volatility yield) are comparable returns from portfolios with comparable risks.

The chart above shows the yields from these two roughly equivalent portfolios. As we can see, in this rough approximation, the vol yield has in fact been higher than the comparable BBB yield throughout the post-crisis period, and the two have moved in step. The relative value before the crisis was unbalanced. It would have paid to buy OTM option spreads, funded by a high yield portfolio (anecdotally, there was an equivalent popular trade at that time, but in the wrong market: the infamous Japanese widow maker). But at present the markets are pretty much in sync with each other and appear efficient. That is far from the "distortion" argument in the narratives above.
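The volatility bond construction can be sketched in a few lines of R. Everything numeric here is a hypothetical input (index level, implied vol, rate, credit spread, recovery rate), and the sketch makes simplifying assumptions that are not from the post: flat Black-Scholes volatility with no skew, and a default probability backed out of the spread with the simple "credit triangle" approximation.

```r
# Black-Scholes price of a European option (no dividends)
bs <- function(S, K, vol, r, tau, type = c("call", "put")) {
  type <- match.arg(type)
  d1 <- (log(S / K) + (r + 0.5 * vol^2) * tau) / (vol * sqrt(tau))
  d2 <- d1 - vol * sqrt(tau)
  if (type == "call") S * pnorm(d1) - K * exp(-r * tau) * pnorm(d2)
  else K * exp(-r * tau) * pnorm(-d2) - S * pnorm(-d1)
}

S <- 2400; vol <- 0.11; r <- 0.01; tau <- 1    # hypothetical index level, implied vol, rate, horizon
spread <- 0.015; recovery <- 0.4               # hypothetical credit spread and recovery rate

# default probability implied by the spread (credit-triangle approximation)
pd <- 1 - exp(-spread / (1 - recovery) * tau)

# short strikes whose risk-neutral probability of finishing in the money equals pd
drift  <- (r - 0.5 * vol^2) * tau
K_call <- S * exp(drift + vol * sqrt(tau) * qnorm(1 - pd))
K_put  <- S * exp(drift + vol * sqrt(tau) * qnorm(pd))

# premium from selling a 100-point-wide call spread and put spread
width <- 100
premium <- (bs(S, K_call, vol, r, tau, "call") - bs(S, K_call + width, vol, r, tau, "call")) +
           (bs(S, K_put,  vol, r, tau, "put")  - bs(S, K_put - width,  vol, r, tau, "put"))
vol_yield <- premium / width                   # premium per 100 of maximum loss
```

The quantity vol_yield is what would be compared against the yield on the credit portfolio; with real data the strikes would come from the market-implied distribution rather than a flat-vol lognormal.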

The only way vol can rationally go up from here is if general risk portfolio yields also go up. That can happen in two ways: either the spread to risk-free rates (like treasuries) increases (signifying a risk-off event as in the narratives above), or rates rise secularly, which basically takes us back to the Fed and inflation. As argued in the last post, pretty much everything we can expect now hangs on the future inflation path.

These results are the outcome of an approximate analysis. We obviously ignored some important issues (like the skew and convexity of these deep OTM strikes) and took some shortcuts (a digital set-up is more appropriate than the option spreads used here). We also missed a more fundamental point, which is correlation. The S&P 500 is a much broader index than the high yield universe, and the comparison above becomes more appropriate as market-wide correlation goes up. As correlation goes lower, we can afford to sell closer-to-the-money option spreads on the S&P to retain the same riskiness in the portfolio, thus making the volatility yield even higher. And as it happens, correlation (again, see the last post) has been down of late. But the main point remains unchanged: vol is low but NOT cheap (although the last few points in 2017 suggest some relative cheapness).

Perhaps it is a good time to stop complaining about low VIX prints and watch those HY spreads and inflation development carefully instead.

All data from CBOE website/ Yahoo Finance/ Bloomberg

Wednesday, January 4, 2017

Systematic Trading: Back-testing Classical Technical Patterns

Following up from my last post on systematic pattern identification in time series, here is the part on identifying and back-testing classical technical analysis patterns. This is based on the classic paper by Lo, Mamaysky and Wang (2000). The major improvement added here lies in defining local extrema in terms of perceptually important points (as opposed to the kernel regression based slope change technique proposed in the paper). In my view, the kernel method can be too noisy and much less robust with real data.

The R package techchart has two functions for identifying classical technical patterns. The function find.tpattern sweeps through the entire time series and finds all pattern matches. It takes the time series as the first parameter (an xts object), a pattern definition to search for, and a couple of tolerance parameters. The first one, tolerance, is used for matching the pattern itself. The second one, pip.tolerance, is used for finding the highs and the lows (perceptually important points) on which the pattern matching is based. These tolerance numbers are expressed as multiples of standard deviation. Below is an example:

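library(quantmod)   # getSymbols, chart_Series, add_TA
library(techchart)  # find.tpattern

x <- getSymbols("^GSPC", auto.assign = FALSE)
tpattern <- find.tpattern(x["2015"], tolerance = 0.5, pip.tolerance = 1.5)
chart_Series(x["2015"])
# highlight the first match; alpha() is assumed here to come from the scales package
add_TA(tpattern$matches[[1]]$data, on = 1, col = scales::alpha("yellow", 0.4), lwd = 5)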

Apart from returning the pattern matches, it also returns some descriptions and characteristics of the match. As below:

summary(tpattern)

## ------pattern matched on: 2015-06-23 --------
## name: Head and shoulder
## type: complete
## move: 1.49 (percentage annualized)
## threshold: 2079.52
## duration: 57 (days)

While this is useful, you must already have spotted the catch. As this function looks at all available data at once to find a pattern, future prices influence past patterns. That is fine for examining a time series visually, but we need another function for rigorous back-testing. The second function available, find.pattern, is meant for this purpose. It takes similar arguments and returns matched patterns. The match is based on either a completed pattern or a forming one. A forming pattern is detected by bumping the last closing price up or down by 1 standard deviation in the next bar and checking whether that completes the pattern.

The definition of a pattern is decoupled from the process of extracting patterns from the data, as proposed in Lo et al. (2000). The pattern defining function in the package is pattern.db. This follows a similar implementation as here by Systematic Investor Blog, with some added features. The implementation of pattern.db in the package techchart contains some basic patterns: head and shoulder (HS), inverse head and shoulder (IHS), broadening top (BTOP) and broadening bottom (BBOT), with HS being the default in the above functions. However, it is trivial to define any pattern (as long as it can be expressed in terms of local highs and lows) and customize this pattern library.
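To show what "expressed in terms of local highs and lows" means in practice, here is a small sketch of a head-and-shoulder test on five consecutive extrema, following the definition in Lo, Mamaysky and Wang (2000): the first extremum is a maximum, the head is above both shoulders, and the shoulders (and the troughs) lie within 1.5% of their respective averages. The function name is hypothetical, and this is not the pattern.db implementation from techchart.

```r
# e: five consecutive extrema values E1..E5 (max, min, max, min, max)
is_head_shoulder <- function(e, band = 0.015) {
  stopifnot(length(e) == 5)
  shoulders <- mean(e[c(1, 5)])
  troughs   <- mean(e[c(2, 4)])
  e[1] > e[2] &&                                            # E1 is a local maximum
    e[3] > e[1] && e[3] > e[5] &&                           # head above both shoulders
    all(abs(e[c(1, 5)] - shoulders) <= band * shoulders) && # shoulders roughly level
    all(abs(e[c(2, 4)] - troughs) <= band * troughs)        # troughs roughly level
}

is_head_shoulder(c(100, 95, 110, 96, 101))   # TRUE
is_head_shoulder(c(100, 95, 99, 96, 101))    # FALSE: the middle peak is not the highest
```

An inverse head and shoulder would be the same test with the inequalities flipped, which is why a library of such definitions is easy to extend.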

With this framework, it becomes quite straightforward to test and analyze pattern performance, run back-tests on pattern based strategies and/or combine patterns with other indicators to devise trading strategies at any given frequency.

Here is an implementation of such a back-test, using the quantstrat package. The strategy is simple. For a given underlying, we scan the data for a head-and-shoulder (or inverse head-and-shoulder) match. Once we find a match, we enter a short (long) position if a short term moving average is below (above) a long term one. Once we have entered a short (long) position, we hold it for at least 5 days, and exit on or after that if the short term moving average is above (below) the long term one. We apply this strategy across the S&P 500, DAX, Nikkei 225 and KOSPI. The chart below shows the strategy performance.

The thick transparent purple line is the average performance across these underlying indices. The performance metrics are as below. The strategy also has (not shown here) a strongly positively skewed return distribution.
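The entry and exit rules above can be written as a small position loop in base R. This is only a sketch of the logic, not the quantstrat back-test: the function names are hypothetical, the moving average windows (10 and 50) are assumptions, and the pattern signal is supplied by hand (-1 for a head-and-shoulder match, +1 for an inverse one, 0 otherwise).

```r
# trailing simple moving average; the first k - 1 values are NA
sma <- function(x, k) as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))

run_positions <- function(price, signal, fast = 10, slow = 50, min_hold = 5) {
  ma_fast <- sma(price, fast); ma_slow <- sma(price, slow)
  pos <- integer(length(price)); held <- 0L
  for (i in seq_along(price)[-1]) {
    pos[i] <- pos[i - 1]
    if (is.na(ma_slow[i])) next
    trend <- sign(ma_fast[i] - ma_slow[i])
    if (pos[i] == 0L) {
      # enter only when the pattern and the moving average trend agree
      if (signal[i] != 0 && signal[i] == trend) { pos[i] <- as.integer(signal[i]); held <- 0L }
    } else {
      held <- held + 1L
      # exit after the minimum holding period once the trend turns against us
      if (held >= min_hold && trend == -pos[i]) pos[i] <- 0L
    }
  }
  pos
}

# toy price path: up, down, up again; one head-and-shoulder signal during the decline
price  <- c(seq(100, 150, length.out = 100), seq(150, 100, length.out = 100),
            seq(100, 150, length.out = 100))
signal <- integer(300); signal[130] <- -1L
pos <- run_positions(price, signal)
```

In this toy path the short is opened at bar 130, held through the decline, and closed once the fast average crosses back above the slow one during the rebound.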

Performance metrics	S&P	DAX	NKY	KOSPI	ALL
Annualized Return	0.0566	0.0536	0.0678	0.0528	0.0639
Annualized Std Dev	0.1233	0.0982	0.1413	0.1205	0.0692
Annualized Sharpe (Rf=0%)	0.4591	0.546	0.4797	0.4382	0.9234

Not spectacular, but nonetheless interesting. The R code for this back-test is here. Apart from techchart, you would need to install quantmod and quantstrat (and associated packages) to run this. Please note, running this pattern finding algorithm can take considerable time depending on the length of the time series and system characteristics.
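As a quick consistency check on the table above, the Sharpe row (with Rf = 0%) is just the annualized return divided by the annualized standard deviation. The numbers below are copied from the table; small differences in the last digit come from the rounding of the inputs.

```r
ann_ret <- c(SP = 0.0566, DAX = 0.0536, NKY = 0.0678, KOSPI = 0.0528, ALL = 0.0639)
ann_sd  <- c(SP = 0.1233, DAX = 0.0982, NKY = 0.1413, KOSPI = 0.1205, ALL = 0.0692)

sharpe <- ann_ret / ann_sd      # Rf = 0%, so Sharpe is return over volatility
round(sharpe, 3)
```

The combined portfolio has a return close to the single-index numbers but a much lower standard deviation, which is where its higher Sharpe ratio comes from.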

Saturday, October 22, 2016

Systematic Trading | An R Package for Automated Technical Analysis

This is an R package for automated technical analysis, and some groundwork for a pattern matching algorithm I plan to build. It is available on GitHub: you can install it directly from GitHub, or fork or download it. Currently it has three functionalities: 1) perceptually important points, 2) change points for time series with linear deterministic trends and 3) automated technical support/resistance/price envelope identification (useful for back-testing, but I have not found the time yet). It also has an undocumented module for technical pattern identification, which is in a fluid state. Please note that this is an early version, and features/data structures may undergo substantial changes in later versions. I copy-paste the R vignette below.

Techchart: Technical Feature Extraction of Time Series Data

The R package techchart is a collection of tools to extract features from time series data for technical analysis and related quantitative applications. While R is not the most suitable platform for carrying out technical analysis with human inputs, this package makes it possible to extract and match technical features and patterns and use them to back-test trading ideas. At present, the package covers four major areas:

Perceptually Important Points (PIPs) identification
Supports/resistance identification (either based on PIPs or the old-fashioned Fibonacci method)
Change point analysis of trends and segmentation of time series based on underlying trend
Identification of technical envelopes (like trend channels or triangles) of a time series

Perceptually Important Points

PIPs are an effort to algorithmically derive a set of important points as perceived by a human to describe a time series. This typically can be a set of minima or maxima points or a set of turning points which are important from a feature extraction perspective. Traditional technical analysis - like technical pattern identification - relies heavily on PIPs. In addition, a set of PIPs can be used to compress a time series in a very useful way. This compressed representation then can be used for comparing segments of time series (match finding) or other purposes. In this package, we have implemented the approach detailed here.

spx <- quantmod::getSymbols("^GSPC", auto.assign = FALSE)
spx <- spx["2014::2015"]
imppts <- techchart::find.imppoints(spx,2)
head(imppts)

##            pos sign   value
## 2014-02-03  22   -1 1741.89
## 2014-03-07  45    1 1878.52
## 2014-03-14  50   -1 1841.13
## 2014-04-03  64    1 1891.43

quantmod::chart_Series(spx)
points(as.numeric(imppts$maxima$pos),as.numeric(imppts$maxima$value),bg="green",pch=24,cex=1.25)
points(as.numeric(imppts$minima$pos),as.numeric(imppts$minima$value),bg="red",pch=25,cex=1.25)

The function takes a time series object (in xts format) and a tolerance level for identifying extreme points (either a percentage or a multiple of standard deviation). It returns an object that holds the list of all PIPs identified, each marked by either -1 (minimum) or 1 (maximum), as well as the maxima and minima points separately as xts objects.
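For intuition about how a tolerance turns a noisy series into a short list of alternating highs and lows, here is a simple zig-zag style detector in base R. It is a stand-in technique with a hypothetical function name, not the algorithm implemented in find.imppoints: a running high is confirmed as a maximum once the series has fallen from it by at least tol (in price units), and symmetrically for minima.

```r
find_turning_points <- function(x, tol) {
  pos <- integer(0); sgn <- integer(0)
  hi <- 1L; lo <- 1L; dir <- 0L     # dir: 0 undecided, 1 rising leg, -1 falling leg
  for (i in seq_along(x)[-1]) {
    if (dir >= 0L && x[i] > x[hi]) hi <- i     # extend the running high
    if (dir <= 0L && x[i] < x[lo]) lo <- i     # extend the running low
    if (dir >= 0L && x[hi] - x[i] >= tol) {
      pos <- c(pos, hi); sgn <- c(sgn, 1L)     # confirmed maximum
      dir <- -1L; lo <- i
    } else if (dir <= 0L && x[i] - x[lo] >= tol) {
      pos <- c(pos, lo); sgn <- c(sgn, -1L)    # confirmed minimum
      dir <- 1L; hi <- i
    }
  }
  data.frame(pos = pos, sign = sgn, value = x[pos])
}

x <- c(10, 12, 15, 13, 9, 8, 11, 14, 12, 16, 10)
pts <- find_turning_points(x, tol = 3)
pts
```

The result has the same pos/sign/value layout as the output shown above, so the points can be plotted or fed to a pattern test in the same way.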

Identification of Change Point in Linear (Deterministic) Trends

Change point analysis has recently become an increasingly important tool for both financial and non-financial time series. There are quite a few packages in R that implement the major algorithms. However, most of them focus on stationary time series, whereas the typical price series encountered in financial markets is in most cases non-stationary. The cpt.trend function in this package implements a change point analysis for non-stationary time series to identify multiple changes in the deterministic linear trends. The implementation is based on identifying changes in simple regression coefficients (with a penalty) and extends to multiple change point identification using the popular binary segmentation methodology. See here for a discussion on different methods. The function find.major.trends extends this functionality to automatically search a time series for the top-level changes in trend by starting with a high penalty value and decreasing it at each step until a set of trends is found.
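The core step of binary segmentation on linear trends can be illustrated with a single split: fit one regression line to the whole series, then find the break point at which two separate lines reduce the residual sum of squares the most, and accept it only if the gain beats a penalty. The sketch below uses synthetic data, hypothetical function names and an arbitrary penalty; it is not the cpt.trend implementation.

```r
# residual sum of squares of a straight-line fit
ssr_line <- function(y) {
  idx <- seq_along(y)
  sum(resid(lm(y ~ idx))^2)
}

# best single break point, with at least min_len observations on each side
best_split <- function(y, min_len = 10) {
  n <- length(y)
  cand <- min_len:(n - min_len)
  cost <- vapply(cand, function(k) ssr_line(y[1:k]) + ssr_line(y[(k + 1):n]), numeric(1))
  list(pos = cand[which.min(cost)], gain = ssr_line(y) - min(cost))
}

set.seed(1)
# synthetic series: rising trend for 100 bars, then a falling trend, plus noise
y <- c(0.5 * (1:100), 50 - 0.3 * (1:100)) + rnorm(200, sd = 1)

cp <- best_split(y)
penalty <- 50                      # arbitrary threshold (assumption)
accept <- cp$gain > penalty        # keep the change point only if the fit improves enough
```

Applying the same search recursively to each accepted segment gives multiple change points, and lowering the penalty step by step reveals progressively finer trends.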

spx <- quantmod::getSymbols("^GSPC", auto.assign = FALSE)
spx <- spx["2014::2015"]
cpts <- techchart::find.major.trends(spx)
summary(cpts)

## change points:
## [1] 411
## segments summary:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    93.0   172.5   252.0   252.0   331.5   411.0

quantmod::chart_Series(spx)

quantmod::add_TA(cpts$segments[[1]],on=1,lty=3, col="red")

quantmod::add_TA(cpts$segments[[2]],on=1,lty=3, col="red")

Supports/ Resistance

Supports and resistance levels are very popular tools for technical analysis. The function find.pivots implements a couple of ways to identify support and resistance levels for a price series. Using the option FIB will produce a set of Fibonacci levels around the most recent price point. The option SR will run an algorithm to find co-linear points along the x-axis (a horizontal line) to find the levels most tested in recent times. A set of levels, as well as an xts representation of the lines defined by them, is returned.

spx <- quantmod::getSymbols("^GSPC", auto.assign = FALSE)
spx <- spx["2014::2015"]
sups <- techchart::find.pivots(spx, type = "FIB")
summary(sups)

## supports and resistance:
## next 3 supports:1982.249 1936.355 1890.461
## next 3 resistance:2130.82

sups <- techchart::find.pivots(spx, type = "SR", strength = 5)
summary(sups)

## supports and resistance:
## next 3 supports:2043.688 1992.551 1895.028
## next 3 resistance:2070.407 2111.588
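The FIB supports printed above are consistent with standard Fibonacci retracements (38.2%, 50% and 61.8%) of the range between the 2014-02-03 low of 1741.89 seen in the PIP output and the 2130.82 level reported as resistance. The sketch below reproduces that arithmetic; it is inferred from the printed numbers and is not taken from the find.pivots source.

```r
hi <- 2130.82                      # resistance level from the FIB output above
lo <- 1741.89                      # 2014-02-03 low from the PIP output above
ratios <- c(0.382, 0.5, 0.618)     # standard Fibonacci retracement ratios

fib_supports <- hi - ratios * (hi - lo)
round(fib_supports, 3)
```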

Price Envelope Identification

Price envelope features are an integral part of technical analysis. For example, technical analysts look for features like trending channels or ascending triangles to identify continuation of, or breakout from, current price action. The function find.tchannel identifies the most recent such envelope using an implementation of the popular Hough transform algorithm from image processing, along with some heuristics.

spx <- quantmod::getSymbols("^GSPC", auto.assign = FALSE)
spx <- spx["2016-01-01::2016-09-30"]
tchannel <- techchart::find.tchannel(spx,1.25)
tchannel

## name: channel
## type: neutral
## direction: 0
## threshold: NA

quantmod::chart_Series(spx)

quantmod::add_TA(tchannel$xlines$maxlines[[1]],on=1, lty=3, col="brown")

quantmod::add_TA(tchannel$xlines$minlines[[1]],on=1, lty=3, col="brown")

The function returns an object with the parameters of the envelopes found (if any), as well as the xts representation of the envelope lines.
