Cointegrated ETF Pairs Part II

Update 5/17: As discussed in the comments, the results are exaggerated because the backtest omits the portfolio rebalancing needed to track the changing hedge ratio. It would be interesting to try an adaptive hedge ratio that requires only weekly or monthly rebalancing to see how legitimately profitable this type of strategy could be.

Welcome back! This week’s post will backtest a basic mean reverting strategy on a cointegrated ETF pair time series constructed using the methods described in part I. Since the EWA (Australia) – EWC (Canada) pair was found to be more naturally cointegrated, I decided to run the rolling linear regression model (EWA chosen as the dependent variable) with a lookback window of 21 days on this pair to create the spread below.

Figure 1: Cointegrated Pair, EWA & EWC

With the adaptive hedge ratio, the spread looks well suited for backtesting a mean reverting strategy. Before that, we should check the minimum capital required to trade this spread. Though everyone has a different margin requirement, I thought it would be useful to walk through how you would calculate the required capital. In this example we assume our broker allows a margin of 50%. We first compute the daily ratio between the pair, EWC/EWA. This ratio represents the number of EWA shares needed per share of EWC so that a 1% move in either ETF produces an equal dollar move. The ratio fluctuates daily but has a mean of 1.43, which makes sense because EWC, on average, trades at a higher price. We then multiply these ratios by the rolling beta. For reference, we can fix the held EWC shares at 100 and multiply the previous values (ratio * rolling beta) by 100 to determine the number of EWA shares that would be held. The capital required to hold this spread can then be calculated as: margin*abs((EWC price * 100) + (EWA price * calculated shares)). This is plotted for our example below.
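
To make the arithmetic concrete, here is a small sketch with purely hypothetical prices and a hypothetical rolling beta (the actual calculation uses the daily values and appears in the full code at the end of the post):

## Worked example with hypothetical values, not taken from the actual series
margin      <- 0.5     ## 50% margin requirement
ewcPrice    <- 28      ## hypothetical EWC price
ewaPrice    <- 20      ## hypothetical EWA price
rollingBeta <- 0.8     ## hypothetical rolling hedge ratio

ratio     <- ewcPrice/ewaPrice          ## 1.4 EWA shares per share of EWC for equal dollar moves
ewaShares <- ratio*rollingBeta*100      ## 112 EWA shares held against 100 EWC shares
required  <- margin*abs(ewcPrice*100 + ewaPrice*ewaShares)
required                                ## 0.5*(2800 + 2240) = 2520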

Figure 2: Required Capital

From this plot we can see that the series has a maximum value of $5,466, which is not a particularly large capital requirement. I hypothesize that the less cointegrated a pair is, the higher the minimum capital will be (try the EWZ-IGE pair).

We can now go ahead and backtest the figure 1 time series! A common mean reversion strategy uses Bollinger Bands, where we enter positions when the price deviates past a Z-score/standard deviation threshold from the mean. The exit signals can be determined from the half-life of mean reversion or from the Z-score itself. To avoid look-ahead bias, I calculated the mean, standard deviation, and Z-score with a rolling 50-day window. Unfortunately, choosing this window introduces some data-snooping bias, but it is a reasonable choice. This backtest also ignores transaction costs and other spread execution nuances but should still reasonably reflect the strategy's potential performance. I decided on the following signals (a minimal sketch of the signal logic follows the list):

  • Enter Long/Close Short: Z-Score < -1
  • Close Long/Enter Short: Z-Score > 1
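
The signal logic boils down to a rolling z-score of the spread. Here is a minimal sketch using TTR's running functions; it assumes xtsData$close holds the spread constructed earlier, and the quantstrat implementation in the full code below (which uses a rollapply-based helper for the same calculation) is what was actually backtested.

## Rolling 50-day z-score of the spread and the Bollinger-style signals (sketch only)
n <- 50
spread <- xtsData$close
zScore <- (spread - runMean(spread, n)) / runSD(spread, n)

enterLong  <- zScore < -1   ## enter long / close short
enterShort <- zScore >  1   ## enter short / close long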

This is a standard Bollinger Bands strategy and results were encouraging.

[Backtest trade statistics output]

Though it made a relatively small number of trades over 13 years, it boasts an impressive 2.7 Sharpe ratio with 97% positive trades. Below, on the left, we can see the strategy's performance vs. SPY (using very minimal leverage), and on the right the positions/trades are shown.

[Strategy performance vs. SPY (left) and positions/trades (right)]

Overall, this definitely supports the potential of trading cointegrated ETF pairs with Bollinger Bands. I think it would be interesting to explore a form of position sizing based on either market volatility or the correlation between the ETF pair and another symbol/ETF. This concludes my analysis of cointegrated ETF pairs for now.

Acknowledgments: Thank you to Brian Peterson and Ernest Chan for explaining how to calculate the minimum capital required to trade a spread. Additionally, all of my blog posts have been edited prior to being published by Karin Muggli, so a huge thank you to her!

Note: I’m currently looking for a full-time quantitative research/trading position beginning summer/fall 2017. I’m currently a senior at the University of Washington, majoring in Industrial and Systems Engineering and minoring in Applied Mathematics. I also have taken upper level computer science classes and am proficient in a variety of programming languages. Resume: https://www.pdf-archive.com/2017/01/31/coltonsmith-resume-g/. LinkedIn: https://www.linkedin.com/in/coltonfsmith. Please let me know of any open positions that would be a good fit for me. Thanks!

Full Code:

detach("package:dplyr", unload=TRUE)
require(quantstrat)
require(IKTrading)
require(DSTrading)
require(knitr)
require(PerformanceAnalytics)
require(quantstrat)
require(tseries)
require(roll)
require(ggplot2)

# Full test
initDate="1990-01-01"
from="2003-01-01"
to="2015-12-31"

## Create "symbols" for Quanstrat
## adj1 = EWA (Australia), adj2 = EWC (Canada)

## Get data
getSymbols("EWA", from=from, to=to)
getSymbols("EWC", from=from, to=to)
dates = index(EWA)

adj1 = unclass(EWA$EWA.Adjusted)
adj2 = unclass(EWC$EWC.Adjusted)

## Ratio (EWC/EWA)
ratio = adj2/adj1

## Rolling regression
window = 21
lm = roll_lm(adj2,adj1,window)

## Plot beta
rollingbeta <- fortify.zoo(lm$coefficients[,2],melt=TRUE)
ggplot(rollingbeta, ylab="beta", xlab="time") + geom_line(aes(x=Index,y=Value)) + theme_bw()

## Calculate the spread using the rolling hedge ratio
## (3273 trading days from 2003-01-01 to 2015-12-31; the first 21 are consumed by the lookback window)
sprd <- vector(length=3273-21)
for (i in 22:3273) {
sprd[i-21] = (adj1[i]-rollingbeta[i,3]*adj2[i]) + 98.86608 ## arbitrary offset so the spread is centered near 100
}
plot(sprd, type="l", xlab="2003 to 2016", ylab="EWA-hedge*EWC")

## Find minimum capital
hedgeRatio = ratio*rollingbeta$Value*100
spreadPrice = 0.5*abs(adj2*100+adj1*hedgeRatio)
plot(spreadPrice, type="l", xlab="2003 to 2016", ylab="0.5*(abs(EWA*100+EWC*calculatedShares))")

## Combine the spread and dates, and convert to xts
close = sprd
date = as.data.frame(dates[22:3273])
data = cbind(date, close)
dfdata = as.data.frame(data)
xtsData = xts(dfdata, order.by=as.Date(dfdata$date)) ## $date partially matches the generated date column name
xtsData$close = as.numeric(xtsData$close)
xtsData$dates.22.3273. = NULL ## drop the now-redundant date column

## Rolling mean, rolling stdev, and z-score of the spread
## Note: the default alignment of the rollapply method that gets dispatched may be centered;
## passing align="right" would make the use of only past data explicit.
rollz<-function(x,n){
avg=rollapply(x, n, mean)
std=rollapply(x, n, sd)
z=(x-avg)/std
return(z)
}

## Varying the lookback has a large effect on the results
xtsData$zScore = rollz(xtsData,50)
symbols = 'xtsData'

## Backtest
currency('USD')
Sys.setenv(TZ="UTC")
stock(symbols, currency="USD", multiplier=1)

#trade sizing and initial equity settings
tradeSize <- 10000
initEq <- tradeSize

strategy.st <- portfolio.st <- account.st <- "EWA_EWC"
rm.strat(portfolio.st)
rm.strat(strategy.st)
initPortf(portfolio.st, symbols=symbols, initDate=initDate, currency='USD')
initAcct(account.st, portfolios=portfolio.st, initDate=initDate, currency='USD',initEq=initEq)
initOrders(portfolio.st, initDate=initDate)
strategy(strategy.st, store=TRUE)

#SIGNALS
add.signal(strategy = strategy.st,
name="sigFormula",
arguments = list(label = "enterLong",
formula = "zScore < -1",
cross = TRUE),
label = "enterLong")

add.signal(strategy = strategy.st,
name="sigFormula",
arguments = list(label = "exitLong",
formula = "zScore > 1",
cross = TRUE),
label = "exitLong")

add.signal(strategy = strategy.st,
name="sigFormula",
arguments = list(label = "enterShort",
formula = "zScore > 1",
cross = TRUE),
label = "enterShort")

add.signal(strategy = strategy.st,
name="sigFormula",
arguments = list(label = "exitShort",
formula = "zScore < -1",
cross = TRUE),
label = "exitShort")

#RULES
add.rule(strategy = strategy.st,
name = "ruleSignal",
arguments = list(sigcol = "enterLong",
sigval = TRUE,
orderqty = 15,
ordertype = "market",
orderside = "long",
replace = FALSE,
threshold = NULL),
type = "enter")

add.rule(strategy = strategy.st,
name = "ruleSignal",
arguments = list(sigcol = "exitLong",
sigval = TRUE,
orderqty = "all",
ordertype = "market",
orderside = "long",
replace = FALSE,
threshold = NULL),
type = "exit")

add.rule(strategy = strategy.st,
name = "ruleSignal",
arguments = list(sigcol = "enterShort",
sigval = TRUE,
orderqty = -15,
ordertype = "market",
orderside = "short",
replace = FALSE,
threshold = NULL),
type = "enter")

add.rule(strategy = strategy.st,
name = "ruleSignal",
arguments = list(sigcol = "exitShort",
sigval = TRUE,
orderqty = "all",
ordertype = "market",
orderside = "short",
replace = FALSE,
threshold = NULL),
type = "exit")

#apply strategy
t1 <- Sys.time()
out <- applyStrategy(strategy=strategy.st,portfolios=portfolio.st)
t2 <- Sys.time()
print(t2-t1)

#set up analytics
updatePortf(portfolio.st)
dateRange <- time(getPortfolio(portfolio.st)$summary)[-1]
updateAcct(portfolio.st,dateRange)
updateEndEq(account.st)

#Stats
tStats <- tradeStats(Portfolios = portfolio.st, use="trades", inclZeroDays=FALSE)
tStats[,4:ncol(tStats)] <- round(tStats[,4:ncol(tStats)], 2)
print(data.frame(t(tStats[,-c(1,2)])))

#Averages
(aggPF <- sum(tStats$Gross.Profits)/-sum(tStats$Gross.Losses))
(aggCorrect <- mean(tStats$Percent.Positive))
(numTrades <- sum(tStats$Num.Trades))
(meanAvgWLR <- mean(tStats$Avg.WinLoss.Ratio))

#portfolio cash PL
portPL <- .blotter$portfolio.EWA_EWC$summary$Net.Trading.PL

## Sharpe Ratio
(SharpeRatio.annualized(portPL, geometric=FALSE))

## Performance vs. SPY
instRets <- PortfReturns(account.st)
portfRets <- xts(rowMeans(instRets)*ncol(instRets), order.by=index(instRets))

cumPortfRets <- cumprod(1+portfRets)
firstNonZeroDay <- index(portfRets)[min(which(portfRets!=0))]
getSymbols("SPY", from=firstNonZeroDay, to="2015-12-31")
SPYrets <- diff(log(Cl(SPY)))[-1]
cumSPYrets <- cumprod(1+SPYrets)
comparison <- cbind(cumPortfRets, cumSPYrets)
colnames(comparison) <- c("strategy", "SPY")
chart.TimeSeries(comparison, legend.loc = "topleft", colorset = c("green","red"))

## Chart Position
rets <- PortfReturns(Account = account.st)
rownames(rets) <- NULL
charts.PerformanceSummary(rets, colorset = bluefocus)

Cointegrated ETF Pairs Part I

The next two blog posts will explore the basics of the statistical arbitrage strategies outlined in Ernest Chan’s book, Algorithmic Trading: Winning Strategies and Their Rationale. In the first post we will construct mean reverting time series data from cointegrated ETF pairs. The two pairs we will analyze are EWA (Australia) – EWC (Canada) and IGE (NA Natural Resources) – EWZ (Brazil).

Figure 1 & 2: Blue: EWA (left) & EWZ (right), Red: EWC (left) & IGE (right)
Figure 3 & 4: Scatter Plots

EWA-EWC is a notable ETF pair since both Australia and Canada's economies are commodity based. Looking at the scatter plot, it seems likely that they cointegrate because of this. IGE-EWZ seems less likely to cointegrate, but we will discover that it is possible to salvage a stationary series with a statistical adjustment. A stationary, mean reverting series implies that the variance of the log price increases more slowly than that of a geometric random walk.
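
This is also the intuition behind the Hurst exponent test used below: for a log-price series z(t), the variance of its changes scales with the time lag tau approximately as Var(z(t+tau) - z(t)) ∝ tau^(2H), so an exponent H below 0.5 corresponds to variance growing more slowly than a random walk. The Hurst exponent test estimates exactly this H.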

Running a linear regression with EWA as the dependent variable and EWC as the independent variable, we can use the resulting beta as the hedge ratio to create the data series below (spread = EWA - beta*EWC).

Figure 5: Cointegrated Pair, EWA & EWC

It appears stationary, but we will run a few statistical tests to support this conclusion. The first test is the Augmented Dickey-Fuller test, which tests whether the series is stationary or trending. We set the lag parameter, k, to 1 since the change in price often has serial correlation.

[ADF test output]

The ADF test rejects the null hypothesis and supports the stationarity of the series with a p-value < 0.04. The next test is the Hurst exponent, which analyzes the variance of the log price and compares it to that of a geometric random walk. A geometric random walk has H = 0.5, a mean reverting series has H < 0.5, and a trending series has H > 0.5. Running this test on the log residuals of the linear model gives a Hurst exponent of 0.27, supporting the ADF's conclusion. With this evidence that the series is stationary, the final analysis is to find its half-life of mean reversion. This is useful for trading as it gives you an idea of what the holding period of the strategy will be. The calculation of the half-life involves regressing y(t)-y(t-1) against y(t-1) and using the resulting lambda. See my code for further explanation. The half-life of this series is found to be 67 days.
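
For reference, the standard way to get the half-life is to fit the discrete mean reversion model y(t) - y(t-1) = lambda*(y(t-1) - mean) + e(t); for a mean reverting series lambda is negative, and the half-life follows as half-life = -ln(2)/lambda, which is what the -log(2)/coef(result)[2] line in the full code computes.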

Next, we will look at the IGE-EWZ pair. Running a linear regression with IGE as the dependent variable and EWZ as the independent variable, we can use the resulting beta as the hedge ratio to create the data series below.

Figure 6: Cointegrated Pair, EWZ & IGE

Compared to the EWA-EWC pair, this looks a lot less stationary, which makes sense considering the price series and scatter plot. Additionally, the ADF test is inconclusive.

[ADF test output]

The half-life of its mean reversion is calculated to be 344 days. In this form, it is definitely not a very practical pair to trade. Something that may improve the stationarity of this time series is an adaptive hedge ratio, determined from a rolling linear regression model with a designated lookback window. Obviously, the shorter the lookback window, the more the beta/hedge ratio will fluctuate. Though this would require daily portfolio adjustments, ideally the stationarity of the series will increase substantially. I began with a lookback window of 252, the number of trading days in a year, but it didn't have a large enough impact. Therefore, we will try 21, the average number of trading days in a month, which has a significant impact. Without the rolling regression, the beta/hedge ratio was 0.42. Below you can see how the beta changes over time and how it affects the mean reverting data series.
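
As a rough check on how the lookback drives the hedge ratio's variability, the two rolling slopes can be compared directly. This sketch reuses adj1 (EWZ) and adj2 (IGE) from the full code at the end of the post:

## Compare hedge-ratio variability for the yearly vs. monthly lookback (sketch)
betaYear  <- roll_lm(adj1, adj2, 252)$coefficients[,2]
betaMonth <- roll_lm(adj1, adj2, 21)$coefficients[,2]
sd(betaYear, na.rm=TRUE)   ## slow-moving hedge ratio
sd(betaMonth, na.rm=TRUE)  ## considerably more variable hedge ratio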

Figure 7: Beta/Hedge Ratio vs. Time
Figure 8: Cointegrated Pair, EWZ & IGE, w/ Adaptive Hedge Ratio

[ADF test output with adaptive hedge ratio]

With the adaptive hedge ratio, the ADF test strongly concludes that the time series is stationary. This also significantly cuts down the half-life of the mean reversion to only 33 days.

Though there are a lot more analysis techniques for cointegrated ETF pairs, and even triplets, this post explored the basics of creating two stationary data series. In next week’s post, we will implement some mean reversion trading strategies on these pairs. See ya next week!

Full Code:

require(quantstrat)
require(tseries)
require(roll)
require(ggplot2)

## EWA (Australia) - EWC (Canada)
## Get data
getSymbols("EWA", from="2003-01-01", to="2015-12-31")
getSymbols("EWC", from="2003-01-01", to="2015-12-31")

## Utilize the backwards-adjusted closing prices
adj1 = unclass(EWA$EWA.Adjusted)
adj2 = unclass(EWC$EWC.Adjusted)

## Plot the ETF backward-adjusted closing prices
plot(adj1, type="l", xlab="2003 to 2016", ylab="ETF Backward-Adjusted Price in USD", col="blue")
par(new=T)
plot(adj2, type="l", axes=F, xlab="", ylab="", col="red")
par(new=F)

## Plot a scatter graph of the ETF adjusted prices
plot(adj1, adj2, xlab="EWA Backward-Adjusted Prices", ylab="EWC Backward-Adjusted Prices")

## Linear regression, dependent ~ independent
comb1 = lm(adj1~adj2)

## Plot the residuals or hedged pair
plot(comb1$residuals, type="l", xlab="2003 to 2016", ylab="Residuals of EWA and EWC regression")

beta = coef(comb1)[2]
X = vector(length = 3273)
for (i in 1:3273) {
X[i]=adj1[i]-beta*adj2[i]
}

plot(X, type="l", xlab="2003 to 2016", ylab="EWA-hedge*EWC")

## ADF test on the residuals
adf.test(comb1$residuals, k=1)
adf.test(X, k=1)

## Hurst Exponent Test
HurstIndex(log(comb1$residuals))

## Half-life of mean reversion: regress the change in the spread on the lagged spread
## and use half-life = -log(2)/lambda
sprd = comb1$residuals
prev_sprd <- c(sprd[2:length(sprd)], 0)
d_sprd <- sprd - prev_sprd
prev_sprd_mean <- prev_sprd - mean(prev_sprd)
sprd.zoo <- merge(d_sprd, prev_sprd_mean)
sprd_t <- as.data.frame(sprd.zoo)

result <- lm(d_sprd ~ prev_sprd_mean, data = sprd_t)
half_life <- -log(2)/coef(result)[2]

#######################################################################################################

## EWZ (Brazil) - IGE (NA Natural Resource)
## Get data
getSymbols("EWZ", from="2003-01-01", to="2015-12-31")
getSymbols("IGE", from="2003-01-01", to="2015-12-31")

## Utilize the backwards-adjusted closing prices
adj1 = unclass(EWZ$EWZ.Adjusted)
adj2 = unclass(IGE$IGE.Adjusted)

## Plot the ETF backward-adjusted closing prices
plot(adj1, type="l", xlab="2003 to 2016", ylab="ETF Backward-Adjusted Price in USD", col="blue")
par(new=T)
plot(adj2, type="l", axes=F, xlab="", ylab="", col="red")
par(new=F)

## Plot a scatter graph of the ETF adjusted prices
plot(adj1, adj2, xlab="EWZ Backward-Adjusted Prices", ylab="IGE Backward-Adjusted Prices")

## Rolling regression
## Trading days
## 252 = year
## 21 = month
window = 21
lm = roll_lm(adj1,adj2,window)

## Plot beta
rollingbeta <- fortify.zoo(lm$coefficients[,2],melt=TRUE)
ggplot(rollingbeta, ylab="beta", xlab="time") + geom_line(aes(x=Index,y=Value)) + theme_bw()

## Construct the spread (IGE - hedge*EWZ) using the rolling hedge ratio
X <- vector(length=3273-21)
for (i in 22:3273) {
X[i-21] = adj2[i]-rollingbeta[i,3]*adj1[i]
}

plot(X, type="l", xlab="2003 to 2016", ylab="IGE-hedge*EWZ")

Social Media Sentiment Analysis and Trading Strategies

Happy New Year! I recently got the opportunity to start doing some work for Ernest Chan’s team at QTS Capital Management and my first project was a literature review of social media sentiment analysis. The PowerPoint presentation above covers the current academic research on social media sentiment analysis, the trading strategies that incorporate social media sentiment, an analysis of the various providers of sentiment data, and much more! If you have any questions or would like the 50+ pages of notes that accompany the presentation, please contact me at coltonsmith321@gmail.com.

Additionally, over the holidays I got a chance to read both of Mr. Chan’s books, Quantitative Trading: How to Build Your Own Algorithmic Trading Business and Algorithmic Trading: Winning Strategies and Their Rationale. I’d highly recommend reading them if you haven’t! They provided valuable insight into properly approaching backtesting and gave me countless new statistical arbitrage strategies to explore. I’m going to have a lot more time this quarter to work on projects for the blog so expect weekly posts!

MACD + SMI Trend Following and Parameter Optimization

Finally a somewhat profitable strategy to analyze! This post will walk through the development of my MACD + SMI strategy, including my experience with parameter optimization and trailing stops. This strategy began with an interest in the Moving Average Convergence/Divergence oscillator (MACD), which I hadn’t yet explored. Also, since the two previous strategies I analyzed were mean-reversion strategies, I thought it’d be good to try out a trend-following strategy. The MACD uses two trend-following moving averages to create a momentum indicator. I used the standard 12-period fast EMA, 26-period slow EMA, and 9-period signal EMA parameters. Although there are a lot of different signals that traders can look at when using the MACD, I kept it simple and was only interested when the MACD (fast EMA – slow EMA) was above/below the signal line. When the MACD is positive it indicates that the upside momentum is increasing, and vice versa for negative values. I then did some research to see what indicators were combined with the MACD. The two that caught my interest were the Stochastic Momentum Index (SMI) and the Chande Momentum Oscillator (CMO). The SMI compares closing prices to the median of the high/low range of prices over a certain period, which makes it a more refined and sensitive version of the Stochastic Oscillator. The values range between -100 and +100, with values less than -40 indicating a bearish trend and values greater than 40 indicating a bullish trend. Normally the SMI can be used similarly to the RSI and indicate overbought/oversold market conditions, but I wanted to focus on using it as a general trend indicator. The CMO indicator is also a momentum oscillator that can be used to confirm possible trends; my backtests confirmed its viability but I decided to center my focus on the SMI. I initially used the standard SMI threshold values (-40/+40) and backtested across the same 30 ETFs from my last post during 2003-2015.
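
For a quick look at these two indicators outside of quantstrat, the sketch below computes them with TTR (using SPY purely as an illustrative symbol) and flags the bullish condition described above; the actual backtest wires the same indicators into quantstrat as shown in the full code at the end of the post.

## Standalone MACD + SMI sketch with TTR (illustrative only)
require(quantmod)   ## also loads TTR
getSymbols("SPY", from="2003-01-01", to="2015-12-31")
macd <- MACD(Cl(SPY), nFast=12, nSlow=26, nSig=9, maType="EMA", percent=TRUE)
smi  <- SMI(HLC(SPY), n=13, nFast=2, nSlow=25, nSig=9, maType="EMA", bounded=TRUE)

## Bullish when the MACD is above its signal line and the SMI indicates an uptrend
bullish <- macd$macd > macd$signal & smi$SMI > 40
tail(bullish)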

[Initial backtest results: trade statistics]

This was my first time seeing such a high profit factor and a Sharpe ratio fairly close to 1, but unfortunately, the number of trades was too low. A minimum of 700-800 trades is necessary in this backtest to support conclusions about strategy performance; however, I was happy to see signs of a profitable strategy. As a note, ATR position sizing was used in every strategy backtest, which nearly doubles the Sharpe ratio. To increase the number of trades made by this strategy I had a few ideas. First, since this strategy only enters long positions, I tried playing around with the indicators to see if it was also good for entering short positions. This unfortunately was not a profitable attempt. Second, I thought about exploring a separate shorting strategy that made ~400 trades and simply putting them together. I decided for the sake of analysis it would be better to stick to one main strategy this time, but this is something I'll consider in the future. Third and finally, after learning about ATR position sizing I had wanted to experiment with another risk management tool: trailing stops. I added a 7% trailing stop and got the results below. It sacrificed some profit factor for a better Sharpe ratio, but it also made over 700 trades. Although this seems like artificially increasing the number of trades, I was content with the results for the time being.

[Backtest results with a 7% trailing stop]

Next, I wanted to see if I could push that profit factor over 4 by optimizing the parameters, namely the SMI thresholds and trailing stop percentages. I did minor parameter optimization in my first post, but this was my first time doing it on a larger scale. I decided to split my time period in half, 2003-2009 and 2010-2015, optimize on the first period, and then use the second period for an out-of-sample test. I didn't expect to see a large difference in performance between the two periods, but I was very wrong. I decided to first optimize the SMI thresholds and found ridiculously large profit factors. From this I concluded that 40 was the best entry threshold, but that the exit signal definitely had the largest impact. I then chose to further optimize 40/55 (highest profit factor), 40/30 (highest Sharpe ratio), and 40/40 (arguably the best mix of profit factor and Sharpe ratio, which supports why they're the standard thresholds).
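
For reference, quantstrat's paramset machinery is one way such a sweep could be wired up. The sketch below distributes a set of exit formulas over the closeLong signal from the full code at the end of the post; I am not asserting this is how the tables below were actually generated, and the labels and arguments are assumptions based on that code.

## Sketch of an SMI exit-threshold sweep using quantstrat paramsets (assumptions noted above)
exitFormulas <- sprintf("(macd.MACD < signal.MACD & SMI.SMI < -%d)", c(30, 35, 40, 45, 50, 55))
add.distribution(strategy.st, paramset.label="SMI_OPT",
component.type="signal", component.label="closeLong",
variable=list(formula=exitFormulas), label="smiExit")
results <- apply.paramset(strategy.st, paramset.label="SMI_OPT",
portfolio.st=portfolio.st, account.st=account.st, nsamples=0)
results$tradeStats ## trade statistics for each exit threshold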

Optimization 2003-2009
SMI Enter (+) | SMI Close (-) | Profit Factor | Trades | Sharpe Ratio
40 | 45 | 14.92 | 152 | 1.13
40 | 50 | 18.25 | 127 | 0.99
40 | 35 | 9.25 | 225 | 1.24
40 | 30 | 8.18 | 258 | 1.29
40 | 55 | 22.14 | 105 | 0.91
40 | 40 | 11.29 | 191 | 1.24
35 | 40 | 10.33 | 201 | 1.25
30 | 40 | 9.94 | 210 | 1.27
45 | 40 | 11.12 | 185 | 1.22
50 | 40 | 10.49 | 179 | 1.17

Then I tested the 3 strategies with 5%, 10%, 15%, and 20% trailing stops. Based on their performance, I decided which ones to test on the out of sample period and the whole sample period.

Optimization 2003-2009
SMI Enter (+) | SMI Close (-) | Trailing Stop | Profit Factor | Trades | Sharpe Ratio | Continue?
40 | 40 | 5% | 4.95 | 535 | 1.67 | *
40 | 40 | 10% | 7.49 | 307 | 1.51 | *
40 | 40 | 15% | 7.94 | 248 | 1.36 | –
40 | 40 | 20% | 9.22 | 213 | 1.33 | *
40 | 55 | 5% | 5.06 | 528 | 1.7 | *
40 | 55 | 10% | 8.4 | 286 | 1.53 | *
40 | 55 | 15% | 9.45 | 207 | 1.4 | –
40 | 55 | 20% | 10.85 | 163 | 1.3 | *
40 | 30 | 5% | 4.97 | 539 | 1.7 | *
40 | 30 | 10% | 6.82 | 335 | 1.51 | –
40 | 30 | 15% | – | – | – | –
40 | 30 | 20% | – | – | – | –
OOS 2010-2015
SMI Enter (+) | SMI Close (-) | Trailing Stop | Profit Factor | Trades | Sharpe Ratio | Continue?
40 | 40 | 5% | 2.3 | 436 | 0.77 | *
40 | 40 | 10% | 2.23 | 277 | 0.53 | *
40 | 40 | 15% | – | – | – | –
40 | 40 | 20% | 2.23 | 221 | 0.48 | –
40 | 55 | 5% | 2.45 | 430 | 0.82 | *
40 | 55 | 10% | 2.84 | 244 | 0.68 | *
40 | 55 | 15% | – | – | – | –
40 | 55 | 20% | 3.92 | 146 | 0.63 | –
40 | 30 | 5% | 2.17 | 447 | 0.71 | *
40 | 30 | 10% | – | – | – | –
40 | 30 | 15% | – | – | – | –
40 | 30 | 20% | – | – | – | –
Whole Period
SMI Enter (+) | SMI Close (-) | Trailing Stop | Profit Factor | Trades | Sharpe Ratio
40 | 40 | 5% | 3.38 | 977 | 1.23
40 | 40 | 10% | 3.95 | 586 | 1.01
40 | 40 | 15% | – | – | –
40 | 40 | 20% | – | – | –
40 | 55 | 5% | 3.54 | 963 | 1.27
40 | 55 | 10% | 4.65 | 531 | 1.09
40 | 55 | 15% | – | – | –
40 | 55 | 20% | – | – | –
40 | 30 | 5% | 3.28 | 993 | 1.21
40 | 30 | 10% | – | – | –
40 | 30 | 15% | – | – | –
40 | 30 | 20% | – | – | –

I was shocked at how much worse the strategies performed in the out-of-sample period compared to the optimization period. It was really quite interesting, and I suspect significantly different market conditions were responsible for the gap across all of the strategies. Across the board the best performing strategy was 40/55, with fairly impressive profit factors and Sharpe ratios. The higher closing threshold is probably a result of the general upward trending market. This strategy performed best with a 5% or 10% trailing stop, but it made almost double the trades with 5%, so I decided to try my original 7% trailing stop. The results are below.

[Final 40/55 strategy with a 7% trailing stop: trade statistics and performance charts]

Overall, not a bad strategy, with a profit factor above 4 and a decent Sharpe ratio. It also handled 2008 very nicely. The buy-and-hold inter-instrument correlation of the 30 ETFs is 0.71 and this strategy cuts it in half, which means it is a well-diversified, risk-managed portfolio strategy.

[Mean inter-instrument correlations of the strategy's returns]

I’m going to continue to search for profitable strategies and maybe look for a short-term, aggressive strategy to analyze next. I also want to use other metrics, such as the Calmar or Information ratio instead of the Sharpe ratio to get a bigger picture of a strategy’s risk management. Additionally, I hope to apply R’s extensive machine learning capabilities to a strategy in the near future. Thanks for reading!

Acknowledgements: Thank you to Ilya Kipnis and Ernest Chan for their continual help.

Full code:

detach("package:dplyr", unload=TRUE)
require(quantstrat)
require(IKTrading)
require(DSTrading)
require(knitr)
require(PerformanceAnalytics)

# Full test
initDate="1990-01-01"
from="2003-01-01"
to="2015-12-31"

# Optimization set
# initDate="1990-01-01"
# from="2003-01-01"
# to="2009-12-31"

# OOS test
# initDate="1990-01-01"
# from="2010-01-01"
# to="2015-12-31"

#to rerun the strategy, rerun everything below this line
source("demoData.R") #contains all of the data-related boilerplate.

#trade sizing and initial equity settings
tradeSize <- 10000
initEq <- tradeSize*length(symbols)

strategy.st <- portfolio.st <- account.st <- "TVI_osATR"
rm.strat(portfolio.st)
rm.strat(strategy.st)
initPortf(portfolio.st, symbols=symbols, initDate=initDate, currency='USD')
initAcct(account.st, portfolios=portfolio.st, initDate=initDate, currency='USD',initEq=initEq)
initOrders(portfolio.st, initDate=initDate)
strategy(strategy.st, store=TRUE)

#parameters (trigger lag unchanged, defaulted at 1)
period = 20
pctATR = .02 #control risk with this parameter
trailingStopPercent = 0.07

#INDICATORS
add.indicator(strategy = strategy.st,
name = "MACD",
arguments = list(x = quote(Cl(mktdata)),
nFast = 12, nSlow = 26, nSig = 9,
maType = "EMA", percent = TRUE),
label = "MACD")

add.indicator(strategy = strategy.st,
name = "SMI",
arguments = list(HLC = quote(HLC(mktdata)), n = 13,
nFast = 2, nSlow = 25, nSig = 9,
maType = "EMA", bounded = TRUE),
label = "SMI")


add.indicator(strategy.st, name="lagATR",
arguments=list(HLC=quote(HLC(mktdata)), n=period),
label="atrX")

#SIGNALS
add.signal(strategy = strategy.st,
name="sigFormula",
arguments = list(label = "closeLong",
formula = "(macd.MACD < signal.MACD & SMI.SMI < -55)",
cross = TRUE),
label = "closeLong")

add.signal(strategy = strategy.st,
name="sigFormula",
arguments = list(label = "buyLong",
formula = "(macd.MACD > signal.MACD & SMI.SMI > 40)",
cross = TRUE),
label = "buyLong")

#RULES

add.rule(strategy = strategy.st,
name = "ruleSignal",
arguments = list(sigcol = "buyLong",
sigval = TRUE,
ordertype = "market",
orderside = "long",
replace=FALSE, prefer="Open", osFUN=osDollarATR,
tradeSize=tradeSize, pctATR=pctATR, atrMod="X"),
type="enter", path.dep=TRUE, label = "LE")

add.rule(strategy = strategy.st,
name = "ruleSignal",
arguments = list(sigcol = "buyLong",
sigval = TRUE,
replace = FALSE,
orderside = "long",
ordertype = "stoptrailing",
tmult = TRUE,
threshold = quote(trailingStopPercent),
orderqty = "all",
orderset = "ocolong"),
type = "chain",
parent = "LE",
label = "StopTrailingLong",
enabled = FALSE)

add.rule(strategy = strategy.st,
name = "ruleSignal",
arguments = list(sigcol = "closeLong",
sigval = TRUE,
orderqty = "all",
ordertype = "market",
orderside = "long",
threshold = NULL),
type = "exit")

enable.rule(strategy.st, type = "chain", label = "StopTrailingLong")

#apply strategy
t1 <- Sys.time()
out <- applyStrategy(strategy=strategy.st,portfolios=portfolio.st)
t2 <- Sys.time()
print(t2-t1)


#set up analytics
updatePortf(portfolio.st)
dateRange <- time(getPortfolio(portfolio.st)$summary)[-1]
updateAcct(portfolio.st,dateRange)
updateEndEq(account.st)

#Stats
tStats <- tradeStats(Portfolios = portfolio.st, use="trades", inclZeroDays=FALSE)
tStats[,4:ncol(tStats)] <- round(tStats[,4:ncol(tStats)], 2)
print(data.frame(t(tStats[,-c(1,2)])))

#Averages
(aggPF <- sum(tStats$Gross.Profits)/-sum(tStats$Gross.Losses))
(aggCorrect <- mean(tStats$Percent.Positive))
(numTrades <- sum(tStats$Num.Trades))
(meanAvgWLR <- mean(tStats$Avg.WinLoss.Ratio))

#portfolio cash PL
portPL <- .blotter$portfolio.TVI_osATR$summary$Net.Trading.PL

#Cash Sharpe
(SharpeRatio.annualized(portPL, geometric=FALSE))

#Individual instrument equity curve
# chart.Posn(portfolio.st, "IYR")

instRets <- PortfReturns(account.st)
portfRets <- xts(rowMeans(instRets)*ncol(instRets), order.by=index(instRets))

cumPortfRets <- cumprod(1+portfRets)
firstNonZeroDay <- index(portfRets)[min(which(portfRets!=0))]
getSymbols("SPY", from=firstNonZeroDay, to="2015-12-31")
SPYrets <- diff(log(Cl(SPY)))[-1]
cumSPYrets <- cumprod(1+SPYrets)
comparison <- cbind(cumPortfRets, cumSPYrets)
colnames(comparison) <- c("strategy", "SPY")
chart.TimeSeries(comparison, legend.loc = "topleft", colorset = c("green","red"))

#Correlations
instCors <- cor(instRets)
diag(instCors) <- NA
corMeans <- rowMeans(instCors, na.rm=TRUE)
names(corMeans) <- gsub(".DailyEndEq", "", names(corMeans))
print(round(corMeans,3))
mean(corMeans)

SMA 200 + RSI 2 w/ATR Position Sizing Strategy Analysis

Hello all! For my second post I decided to analyze an aggressive short-term strategy. While researching, I found a good amount of literature supporting the potential of SMA 200 and RSI 2 strategies, so it seemed like a solid place to begin. I was also reading through previous posts on the Quantstrat Trader blog and ATR position sizing caught my attention, so I decided to add it to the strategy (https://quantstrattrader.wordpress.com/2014/06/11/trend-vigor-part-iii-atr-position-sizing-annualized-sharpe-above-1-4-and-why-leverage-is-pointless/). Parts of the code and analysis later in this post were influenced by the Quantstrat Trader blog and Ilya Kipnis himself, so a big thank you to him! Though the tested strategies proved not to be as successful as I had originally hoped, I was able to draw valuable conclusions from the data while improving my backtesting skills.

To start, I analyzed two variations of a strategy that used four indicators. In my previous post, the Bollinger Bands and RSI strategy used a 14-period RSI indicator. This strategy uses a 2-period RSI, which makes it a much more sensitive momentum indicator. Two simple moving averages, 5 and 200, were used to track short and long term market trends respectively. Finally, a lagging average true range (ATR) indicator, provided by the IKTrading package, was used to monitor market volatility. This allowed me to adjust trade sizes and normalize risk. The first version of the strategy, V1, uses the 200-day SMA to determine the market trend direction (closing price above/below) and the 2-period RSI to identify overbought/oversold situations that could mean revert. It then makes quick exits, hopefully closing a profitable trade on the long or short side, once the mean reversion is done, as indicated by the relationship between the closing price and the 5-day SMA. The signals are shown below (a standalone sketch of the indicators follows the list):

  • Enter long: Close > SMA 200 & RSI 2 < 20
  • Enter short: Close < SMA 200 & RSI 2 > 80
  • Exit long: Close > SMA 5
  • Exit short: Close < SMA 5
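
Outside of quantstrat, the V1 conditions amount to a few TTR indicators compared against the close (a sketch using SPY purely as an illustrative symbol; the actual backtest used the ATR-sized quantstrat rules described below):

## Sketch of the V1 entry/exit conditions with TTR (illustrative only)
require(quantmod)   ## also loads TTR
getSymbols("SPY", from="2003-01-01", to="2015-12-31")
cl     <- Cl(SPY)
sma200 <- SMA(cl, n=200)
sma5   <- SMA(cl, n=5)
rsi2   <- RSI(cl, n=2)

enterLong  <- cl > sma200 & rsi2 < 20
enterShort <- cl < sma200 & rsi2 > 80
exitLong   <- cl > sma5
exitShort  <- cl < sma5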

The second version of the strategy, V2, is similar to V1 except it only enters long side positions and has an additional exit signal based on the 2-period RSI. The signals are shown below:

  • Enter long: Close > SMA 200 & RSI 2 < 20
  • Exit long: Close > SMA 5
  • Exit long: RSI 2 > 80

Both of these strategies were run with an ATR position sizing function, using a 20-period moving average and a risk percentage of 2%. They were backtested on the following ETFs from 2003-2016, and on SPY from 2010-2016 and 2000-2016. The metrics primarily used to analyze the results were profit factor (gross profits/gross losses), percent positive, and annualized Sharpe ratio. Both strategies made roughly 10,000 trades across the ETFs over the 13-year period. Together these metrics give a fairly complete view of the strategy's profitability and risk.
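
The idea behind the ATR position sizing is simply to risk a roughly fixed dollar amount per unit of volatility, so trades shrink when the ATR rises. A conceptual sketch with a hypothetical ATR value (not the exact IKTrading::osDollarATR implementation):

## Conceptual ATR position sizing (hypothetical ATR value)
tradeSize <- 10000
pctATR    <- 0.02
atr       <- 1.25                          ## hypothetical 20-period ATR of the instrument
shares    <- round(tradeSize*pctATR/atr)   ## 160 shares; halves if the ATR doubles
shares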

[List of the 30 ETFs tested]

Strategy V1’s Performance

[Trade statistics: ETFs w/o ATR Position Sizing]
[Trade statistics: ETFs w/ ATR Position Sizing]

SPY | 2010-2016 (no ATR) | 2010-2016 (ATR) | 2000-2016 (no ATR) | 2000-2016 (ATR)
Percent Positive | 63.5% | 62.8% | 64.6% | 64.5%
Profit Factor | 1.59 | 1.31 | 1.72 | 1.44
Annual Sharpe | 1.99 | 1.51 | 2.27 | 2.08

Overall, with or without ATR position sizing, this was not a winning strategy as implemented. Most surprisingly, position sizing negatively impacted the profit factor, percent positive, and annualized Sharpe. The strategy did display its ability to adjust to market volatility.

[Strategy V1 on IYR w/ ATR position sizing: price, position fill, and ATR indicator]

The above figure shows the strategy's performance on IYR, the US Real Estate ETF. Looking at the ETF price, position fill (blue columns), and ATR indicator (bottom blue line), you can see the gradual reduction in position sizing leading up to the 2008 recession and the transition to short side positions. The ATR position sizing was a definite success in appropriately adjusting trade sizes based on market volatility, which this figure shows nicely.

After analyzing the position fill on this ETF as well as others, a common flaw of this strategy seems to be holding long positions too long into a downturn and consequently not entering short positions soon enough. As visible in the figure above, these transitions are quite mistimed. If these timings could be improved, I believe the strategy's profitability would improve drastically.

Strategy V2’s Performance

[Trade statistics: ETFs w/o ATR Position Sizing]
[Trade statistics: ETFs w/ ATR Position Sizing]

SPY | 2010-2016 (no ATR) | 2010-2016 (ATR) | 2000-2016 (no ATR) | 2000-2016 (ATR)
Percent Positive | 60.7% | 64.8% | 61.4% | 63.7%
Profit Factor | 1.55 | 1.48 | 1.40 | 1.33
Annual Sharpe | 1.59 | 2.15 | 1.20 | 1.66

The results of this strategy told a slightly different story. Across the ETFs and SPY, the profit factor was negatively affected by ATR position sizing, but the Sharpe ratio was improved. I had initially hypothesized that ATR position sizing should increase the Sharpe ratio since it prevents huge losses by cutting exposure when volatility rises, so this was exciting to see.

[Strategy V2 on IYR w/ ATR position sizing]

Once again, as visible in the IYR figure above, the strategy does a fairly good job of sitting out severe downtrends. Strategy V1's trade-side distribution was about 80% long and 20% short, so this strategy was essentially just the long trades with the additional exit signal, which may or may not have been beneficial.

I also tested these strategies with more sensitive signals by changing the RSI thresholds to 90/10, but that did not improve their performance. Overall, these strategies let me experiment with short-term trading and the metrics used to evaluate it, which will be useful in future projects. With regard to ATR position sizing, I strongly believe it would be much more helpful when applied to a long-term trading strategy. In a short-term trading strategy, individual trade sizes become essentially irrelevant as the number of trades increases, making the strategy's percent positive, win-loss ratio, etc. more important. In a long-term trading strategy where positions are potentially held for months, the trade size is much more important. Based on the qualitative impact ATR position sizing had on this strategy, I now understand how essential a risk management tool it is.

Another risk management tool that I want to incorporate into my next strategy is trailing stops. The figure below is the MAE plot for strategy V1, which can be used to determine an appropriate trailing stop. For example, a 4% trailing stop added to this strategy would cut a significant portion of the losses.

[Maximum Adverse Excursion (MAE) plot for strategy V1]

Now that I have a basic understanding of Quantstrat and evaluating strategy performance, I plan to focus on finding more profitable strategies in my next projects. I still want to dive into something a little more statistically/mathematically dense. Currently I am taking an advanced applied mathematics course, so maybe I’ll get some ideas in there! Additionally, I eventually want to incorporate machine learning into a strategy. Thanks for reading!

Bollinger Bands and RSI Analysis with Quantstrat

Hello, my name is Colton Smith and this is my first hands-on experience backtesting quantitative trading strategies. For the last 6 months quantitative trading has been a personal topic of interest that I have dedicated a lot of time to researching. I'm currently a senior at the University of Washington, majoring in Industrial and Systems Engineering with a minor in Applied Mathematics. Outside of my major, I have taken computer science courses and been heavily involved in the UW Math Club, of which I am now the President. With this background and experience I have become very familiar with statistics, probability, and R. This upcoming year I will be taking more computer science courses and continuing quant projects to further prepare myself for internship and job opportunities.

The most logical place to begin was the Quantstrat package in R. I wanted to start with a simple strategy that would let me explore the different functions of Quantstrat and play around with some data. Being already familiar with Bollinger Bands, a volatility indicator, I thought it would be interesting to analyze them combined with RSI, a momentum indicator.

Therefore, I chose to analyze this strategy from 01/01/2010 to 12/31/2015 because no combination handled 2008 very well and I wanted consistent data to work with. The Bollinger Bands indicator used a simple moving average with n = 20 and the RSI used a weighted moving average with n = 14. The buy signal was when the close was below the lower Bollinger Band and the RSI was below the lower threshold, which would indicate the stock may be oversold and could mean revert upwards. The sell signal was when the close was above the upper Bollinger Band and the RSI was above the upper threshold, which would indicate the stock may be overbought and could mean revert downwards. I decided to gather data for the 150 different combinations of Bollinger Band standard deviations (1, 1.25, 1.5, 1.75, 2, 2.25), upper RSI thresholds (60, 65, 70, 75, 80), and lower RSI thresholds (20, 25, 30, 35, 40). These ranges were chosen to ensure that all combinations executed a decent number of trades during this period. Some combinations (<5%) still didn't execute very many trades, which may have skewed some of the data. These strategies were tested on SPY with a starting account equity of $100,000 and each buy signal making a purchase of 500 shares. I chose to analyze each combination's performance based on its CAGR, or Compound Annual Growth Rate. For reference, SPY's CAGR in this time period was 10.6%.
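
For reference, the 150 comes from the cross product of the three parameter sets (a sketch with hypothetical variable names):

## 6 standard deviations x 5 upper RSI thresholds x 5 lower RSI thresholds = 150 combinations
combos <- expand.grid(bbSD     = c(1, 1.25, 1.5, 1.75, 2, 2.25),
                      rsiUpper = c(60, 65, 70, 75, 80),
                      rsiLower = c(20, 25, 30, 35, 40))
nrow(combos)  ## 150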

Figure 1
Figure 2

Figure 1 displays the results of each combination. The best performing combination used Bollinger Bands standard deviations of 1 and upper/lower RSI thresholds of 80/40, which had nearly a 40% CAGR. Looking at figures 1 and 2 together, you can see how the standard deviation of the strategy's performance shrinks as larger Bollinger Band standard deviations are used. This is because at the beginning of figure 1, where lower standard deviations are used, the RSI thresholds control the strategy; when the Bollinger Band standard deviations increase, they dominate the strategy. Additionally, as displayed in figure 2, the convention of using Bollinger Bands with a standard deviation of 2 for the best performance is supported.

As visible from the highlighted combinations in figure 1 and from the table below, strategies with higher upper and lower RSI thresholds performed better because they bought more often and sold less frequently. I believe this worked well because of the upward market trend. The 3D surface plot below helps visualize the influence of the upper and lower RSI thresholds. This plot uses Bollinger Bands standard deviations of 2 and nicely shows how increasing the lower RSI threshold has a drastic effect on the CAGR, much more than the upper RSI threshold does. For reference, the red dots were combinations that didn't beat the SPY CAGR, the yellow dots were ones that outperformed the SPY CAGR but were below 25%, and the green dots were combinations that had a CAGR above 25%.

Figure 3

Overall, this project was very helpful for learning how to work with multiple signals and paramsets in Quantstrat, as well as some data visualization techniques. Next, I plan to explore a strategy where I can utilize my statistical knowledge, and one that makes much more frequent trades so I can analyze risk and profit factors.

Acknowledgements: Thank you to Ernest Chan, Ilya Kipnis, and Brian Peterson for answering my elementary questions about Quantstrat and backtesting.