I'm a College Student, and I'm Still Building my Robo-Advisor (Part 2)
We’ve now created a functional robo advisor on Quantopian. But, the IDE puts limits on the scale and sophistication of our work. Let's extend this further with Zipline.
All code and outside instructions for setup, etc. can be found in the GitHub repository here. This post is a sequel to building a robo advisor in Quantopian. You can find the first part here.
Pushing Development with Zipline
Hey all, I’m Rao, an intern at Alpaca. About two weeks ago, I published an article about the Robo-Advisor I’ve been working on. In the last two weeks, I’ve moved my work offline, pushing development with Zipline. In that time, I’ve picked up a few tips and tricks, and I wanted to share them with you.
Zipline — What, When, Why, Where, How?
We’ve now created a functional robo advisor on Quantopian. But, the IDE puts limits on the scale and sophistication of our work. To extend this robo advisor further, it’s best to do our future work locally with Zipline.
This shift offers advantages. We can use multiple files, configuration files, and generally develop our robo-advisor like any other Python app. Being free from a single-file format means that our work can be more organized, and individual functions less bloated (more on that later).
It’s not all roses, though. Developing in Zipline comes with its own challenges. But, if I can do it, you can too!
Installing zipline on your local machine isn’t as simple as `pip install zipline`. It has several dependencies, which vary slightly for each OS. You can find all necessary installation instructions here.
I made several mistakes during installation, so I recommend you run zipline in either a virtual environment, or my personal preference, in a Docker container. I’ve attached both the finished container to play with, as well as the corresponding Dockerfile. Take some time, figure out your development environment. Ready? Let’s get started.
Zipline Commands — Bundles and Execution
Open a new Python file locally. Copy and paste your code from the Quantopian IDE. Now, we’re ready to go... just yanking you around. We’re not close to ready. Quantopian’s IDE is a great development environment. It takes care of background processes, allowing you to focus solely on your algorithm. With Zipline, those background steps are ours to handle.
First, we need historical data to run our algorithm on. Zipline reads through data using something called bundles. To use the bundles, they need to be ingested first, which is done using the command:
$ zipline ingest -b <bundle>
Zipline provides two bundles, quandl and quantopian-quandl. To ingest quandl, you’ll need to make a Quandl account and obtain an API key. You won’t need either for quantopian-quandl. Zipline also lets you create custom bundles by writing a custom ingest function. That will be reserved for a later post.
Neither quandl nor quantopian-quandl provides ETF data, so I’ve provided a custom alpaca bundle. The instructions for installing this bundle locally can be found in the GitHub repository readme.
You run zipline code using the following format:
$ zipline run -f <filename> -b <bundle> --start <date> --end <date>
As you can see, much of what you set through Quantopian’s GUI (the algorithm file, the data source, and the backtest dates) is passed on the command line with Zipline.
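If you end up running many backtests, it can help to assemble that command in Python instead of retyping it. Here’s a minimal sketch of that idea. The helper name zipline_run_command and the file name are my own placeholders, and it only uses the four flags shown above.

```python
import shlex


def zipline_run_command(filename, bundle, start, end):
    """Assemble the `zipline run` invocation as an argument list."""
    return [
        "zipline", "run",
        "-f", filename,    # algorithm file
        "-b", bundle,      # data bundle to backtest against
        "--start", start,  # first day of the backtest (YYYY-MM-DD)
        "--end", end,      # last day of the backtest (YYYY-MM-DD)
    ]


# Hypothetical file name and dates, for illustration only.
cmd = zipline_run_command("my-algo.py", "quantopian-quandl", "2018-01-01", "2018-06-01")

# Print the command the way you would type it in a shell.
print(" ".join(shlex.quote(part) for part in cmd))

# To actually launch it (requires Zipline to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Keeping the arguments in a list rather than one long string means you never have to worry about quoting file names with spaces.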
Docker Containers:
I’ve created a Docker container running Python 3.6 with Zipline and the necessary dependencies installed. If you’re hesitant about installing Zipline locally, you can pull the container from Docker Hub and experiment with the environment.
Run the image with the command:
$ docker run -it alpaca/roboadvisor /bin/bash
A First Zipline Example — Buy and Hold
Just as I did on Quantopian, I started with Zipline using a simple buy and hold strategy.
from zipline.api import *
import datetime

def initialize(context):
    context.stocks = symbols('VTI', 'VXUS', 'BND', 'BNDX')
    context.bought = False
    risk_level = 5
    risk_based_allocation = {0: (0,0,0.686,0.294),
                             1: (0.059,0.039,0.617,0.265),
                             2: (0.118,0.078,0.549,0.235),
                             3: (0.176,0.118,0.480,0.206),
                             4: (0.235,0.157,0.412,0.176),
                             5: (0.294,0.196,0.343,0.147),
                             6: (0.353,0.235,0.274,0.118),
                             7: (0.412,0.274,0.206,0.088),
                             8: (0.470,0.314,0.137,0.059),
                             9: (0.529,0.353,0.069,0.029),
                             10: (0.588,0.392,0,0)}
    #Saves the weights to easily access during rebalance
    context.target_allocation = dict(zip(context.stocks, risk_based_allocation[risk_level]))

def handle_data(context, data):
    if not context.bought:
        for stock in context.stocks:
            if (context.target_allocation[stock] == 0):
                continue
            amount = (context.portfolio.cash * context.target_allocation[stock]) / data.current(stock, 'price')
            order(stock, int(amount))
            print("Ordered " + str(int(amount)) + " shares of " + str(stock))
        context.bought = True
Here are a few differences to keep in mind. First, all the custom functions like order, symbols, etc. are no longer automatically included, and have to be manually imported (from zipline.api import *). Next, this is rather silly, but now that we’re not in Quantopian’s IDE, we’re no longer going to use log.info to track transactions, but print statements.
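To make the order sizing in handle_data concrete, here’s a small standalone sketch of the same arithmetic, with no Zipline required. The function name shares_to_buy, the starting cash, and the prices are all made up for illustration. Only the risk level 5 weights come from the table above.

```python
def shares_to_buy(cash, weights, prices):
    """Whole-share order sizes: cash * weight / price, truncated to an int."""
    orders = {}
    for symbol, weight in weights.items():
        if weight == 0:
            continue  # same skip as in handle_data
        orders[symbol] = int(cash * weight / prices[symbol])
    return orders


# Risk level 5 weights from the allocation table; prices are hypothetical.
weights = {"VTI": 0.294, "VXUS": 0.196, "BND": 0.343, "BNDX": 0.147}
prices = {"VTI": 137.0, "VXUS": 56.5, "BND": 81.0, "BNDX": 54.5}

orders = shares_to_buy(100000.0, weights, prices)
print(orders)
```

Because every order is truncated to whole shares, a little cash always stays uninvested, on top of whatever the weights themselves leave aside.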
Let’s run the code with the format from earlier, and run the tests from January to June:
$ zipline run -f buy-and-hold.py -b alpaca --start 2018-01-01 --end 2018-06-01
If there are no syntax errors, Zipline will spit out a whole bunch of data. How do we know if we’re right? Scroll until you find the following table from STDOUT:
Find the cumulative alpha value (performance against benchmark), and compare it to the cumulative alpha value when you run the same algorithm on Quantopian. They should be very similar values.
We’ve got a good idea of how Zipline works, so we can go ahead and implement the rest of our single-universe algorithm (distance computation and rebalancing). I spent an entire post talking about this implementation, so I don’t think there’s a need to re-hash it. If you haven’t read that, I recommend you do — it’s a great read! (a totally unbiased opinion)
Multiple Universes
In the previous post, I showed how to expand the algorithm to cover all possible Vanguard universes by adding multiple dictionaries in the initialize function. But, if we want to implement all six Vanguard universes, we would finish with a bloated initialize function, which doesn’t look that pretty.
But now we’re away from Quantopian, and here with Zipline! We can spread our code across multiple files, so let’s go ahead and do that.
To be used by the algorithm, a universe needs two sets of information. The first is the set of symbols, and the second is the weight distribution based on the risk. Because of the way the symbols function works, each universe’s list of symbols needs to be set in the initialization function.
But, the weights are dictionaries that aren’t bound to any zipline.api functions, so we can actually configure those in a separate file. The ConfigParser package can read in data stored in an external .ini file. More importantly, it stores the data in a structure similar to a dictionary (key/value), which makes it easy to look up and organize.
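First, we need to make sure the package is available. On Python 3, `configparser` is part of the standard library, so there is nothing to install. Only on Python 2 would you need the backport:
$ pip install configparser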
With the package installed, it’s time to create the INI file. The INI file contains the information for allocation based on risk level.
[CORE_SERIES]
0 = (0,0,0.686,0.294)
1 = (0.059,0.039,0.617,0.265)
2 = (0.118,0.078,0.549,0.235)
3 = (0.176,0.118,0.480,0.206)
4 = (0.235,0.157,0.412,0.176)
5 = (0.294,0.196,0.343,0.147)
6 = (0.353,0.235,0.274,0.118)
7 = (0.412,0.274,0.206,0.088)
8 = (0.470,0.314,0.137,0.059)
9 = (0.529,0.353,0.069,0.029)
10 = (0.588,0.392,0,0)
[CRSP_SERIES]
0 = (0,0,0,0,0,0.273,0.14,0.123,0.15,0.294)
1 = (0.024,0.027,0.008,0.03,0.009,0.245,0.126,0.111,0.135,0.265)
2 = (0.048,0.054,0.016,0.061,0.017,0.218,0.112,0.099,0.12,0.235)
3 = (0.072,0.082,0.022,0.091,0.027,0.191,0.098,0.086,0.105,0.206)
4 = (0.096,0.109,0.03,0.122,0.035,0.164,0.084,0.074,0.09,0.176)
5 = (0.120,0.136,0.038,0.152,0.044,0.136,0.07,0.062,0.075,0.147)
6 = (0.143,0.163,0.047,0.182,0.053,0.109,0.056,0.049,0.06,0.118)
7 = (0.167,0.190,0.055,0.213,0.061,0.082,0.042,0.037,0.045,0.088)
8 = (0.191,0.217,0.062,0.243,0.071,0.055,0.028,0.024,0.030,0.059)
9 = (0.215,0.245,0.069,0.274,0.079,0.027,0.014,0.013,0.015,0.029)
10 = (0.239,0.272,0.077,0.304,0.088,0,0,0,0,0)
[S&P_SERIES]
0 = (0,0,0,0,0.273,0.140,0.123,0.150,0.294)
1 = (0.048,0.011,0.03,0.009,0.245,0.126,0.111,0.135,0.265)
2 = (0.097,0.021,0.061,0.017,0.218,0.112,0.099,0.12,0.235)
3 = (0.145,0.031,0.091,0.027,0.191,0.098,0.086,0.105,0.206)
4 = (0.194,0.041,0.122,0.035,0.164,0.084,0.074,0.09,0.176)
5 = (0.242,0.052,0.152,0.044,0.136,0.07,0.062,0.075,0.147)
6 = (0.29,0.063,0.182,0.053,0.109,0.056,0.049,0.06,0.118)
7 = (0.339,0.073,0.213,0.061,0.082,0.042,0.037,0.045,0.088)
8 = (0.387,0.083,0.243,0.071,0.055,0.028,0.024,0.03,0.059)
9 = (0.436,0.093,0.274,0.079,0.027,0.014,0.013,0.015,0.029)
10 = (0.484,0.104,0.304,0.088,0,0,0,0,0)
[RUSSEL_SERIES]
0 = (0,0,0,0,0,0.273,0.14,0.123,0.15,0.294)
1 = (0.028,0.026,0.005,0.03,0.009,0.245,0.126,0.111,0.135,0.265)
2 = (0.056,0.052,0.01,0.061,0.017,0.218,0.112,0.099,0.12,0.235)
3 = (0.084,0.079,0.013,0.091,0.027,0.191,0.098,0.086,0.105,0.206)
4 = (0.112,0.105,0.018,0.122,0.035,0.164,0.084,0.074,0.09,0.176)
5 = (0.14,0.131,0.023,0.152,0.044,0.136,0.07,0.062,0.075,0.147)
6 = (0.168,0.157,0.028,0.182,0.053,0.109,0.056,0.049,0.06,0.118)
7 = (0.196,0.184,0.032,0.213,0.061,0.082,0.042,0.037,0.045,0.088)
8 = (0.224,0.210,0.036,0.243,0.071,0.055,0.028,0.024,0.03,0.059)
9 = (0.252,0.236,0.041,0.274,0.079,0.027,0.014,0.013,0.015,0.029)
10 = (0.281,0.262,0.045,0.304,0.088,0,0,0,0,0)
[INCOME_SERIES]
0 = (0,0,0,0,0.171,0.515,0.294)
1 = (0.015,0.044,0.01,0.029,0.154,0.463,0.265)
2 = (0.03,0.088,0.019,0.059,0.137,0.412,0.235)
3 = (0.044,0.132,0.03,0.088,0.12,0.36,0.206)
4 = (0.059,0.176,0.039,0.118,0.103,0.309,0.176)
5 = (0.073,0.221,0.049,0.147,0.086,0.257,0.147)
6 = (0.088,0.265,0.059,0.176,0.068,0.206,0.118)
7 = (0.103,0.309,0.068,0.206,0.052,0.154,0.088)
8 = (0.117,0.353,0.079,0.235,0.034,0.103,0.059)
9 = (0.132,0.397,0.088,0.265,0.018,0.051,0.029)
10 = (0.147,0.441,0.098,0.294,0,0,0)
[TAX_SERIES]
1 = (0.024,0.027,0.008,0.03,0.009,0.882)
2 = (0.048,0.054,0.016,0.061,0.017,0.784)
3 = (0.072,0.082,0.022,0.091,0.027,0.686)
4 = (0.096,0.109,0.03,0.122,0.035,0.588)
5 = (0.12,0.136,0.038,0.152,0.044,0.49)
6 = (0.143,0.163,0.047,0.182,0.053,0.392)
7 = (0.167,0.190,0.055,0.213,0.061,0.294)
8 = (0.191,0.217,0.062,0.243,0.071,0.196)
9 = (0.215,0.245,0.069,0.274,0.079,0.098)
INI files are separated into sections, with each new section marked by a section title in brackets. The section titles act much like keys in a dictionary, and are a useful way to organize inputs. This file organizes each individual Vanguard universe as its own section. If you’d like to add your own universe, fork the gist, create a new section with the universe name in brackets, and list the information below it.
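Here’s a quick sketch of how those sections look from Python’s side. I’ve inlined a two-section excerpt of the file as a string (the INI_TEXT name is just for this example) so it runs without any file on disk.

```python
from configparser import ConfigParser

# A short excerpt of the INI file, inlined so the example runs anywhere.
INI_TEXT = """
[CORE_SERIES]
0 = (0,0,0.686,0.294)
5 = (0.294,0.196,0.343,0.147)

[TAX_SERIES]
5 = (0.12,0.136,0.038,0.152,0.044,0.49)
"""

parser = ConfigParser()
parser.read_string(INI_TEXT)

print(parser.sections())                 # section titles, in file order
print(list(parser["CORE_SERIES"]))       # keys inside one section
print(repr(parser["CORE_SERIES"]["5"]))  # values come back as plain strings
```

Notice that both the keys and the values are strings at this point, which is why a conversion step is still needed before the algorithm can use them.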
The robo-advisor’s algorithm parses input as a dictionary. While the INI file works much like a dictionary, we’ll still need to write a function to actually translate it into a dictionary.
import ast
from configparser import ConfigParser

def section_to_dict(section):
    config = ConfigParser()
    config.read('universe-config.ini')
    out_dict = {}
    for key in config[section]:
        out_dict[int(key)] = ast.literal_eval(config[section][key])
    return out_dict
The ConfigParser is initialized, and then reads the given INI file. The section to read is passed in as an argument. If we refer back to the INI file, we’ll see that entries take the form key = value. So, like a dictionary, we can iterate through the keys of an INI section, adding each key/value pair to the output dictionary. (Note: If you receive a KeyError, it’s most likely because the path to your INI file is wrong.)
Two more things to be aware of. In the INI file, everything is a string, but our algorithm expects integer keys and tuple values. For the keys, just casting them as integers is enough, but for the tuples, we’ll have to unstring the value. For that, I found ast.literal_eval to be the function that worked best.
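As a small illustration of that conversion, here’s what happens to a single key/value pair. The variable names are placeholders, and the raw strings mimic what the INI parser hands back for risk level 5 of the core series.

```python
import ast

raw_key = "5"
raw_value = "(0.294,0.196,0.343,0.147)"  # a string, exactly as read from the INI file

risk_level = int(raw_key)              # "5" -> 5
weights = ast.literal_eval(raw_value)  # "(...)" -> tuple of floats

print(type(raw_value).__name__, "->", type(weights).__name__)
print(risk_level, weights)

# literal_eval only accepts Python literals, so anything that looks like
# executable code is rejected instead of being run.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected:", type(exc).__name__)
```

That last part is the main reason to prefer ast.literal_eval over plain eval for a config file that someone else might edit.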
Now, let’s integrate this new method of retrieving weight-based allocation with our robo-advisor algorithm:
from zipline.api import *
from configparser import ConfigParser
import datetime
import ast

def initialize(context):
    print("Starting the robo advisor")
    core_series = symbols('VTI', 'VXUS', 'BND', 'BNDX')
    crsp_series = symbols('VUG', 'VTV', 'VB', 'VEA', 'VWO', 'BSV', 'BIV', 'BLV', 'VMBS', 'BNDX')
    sp_series = symbols('VOO', 'VXF', 'VEA', 'VWO', 'BSV', 'BIV', 'BLV', 'VMBS', 'BNDX')
    #ten symbols, matching the ten weights per row in the INI file
    russell_series = symbols('VONG', 'VONV', 'VTWO', 'VEA', 'VWO', 'BSV', 'BIV', 'BLV', 'VMBS', 'BNDX')
    income_series = symbols('VTI', 'VYM', 'VXUS', 'VYMI', 'BND', 'VTC', 'BNDX')
    tax_series = symbols('VUG', 'VTV', 'VB', 'VEA', 'VWO', 'VTEB')
    config = ConfigParser()
    context.stocks = crsp_series
    risk_based_allocation = section_to_dict('CRSP_SERIES', config)
    #1-9 for Tax Efficient Series, 0-10 otherwise
    risk_level = 5
    if (risk_level not in risk_based_allocation):
        raise Exception("Portfolio Doesn't Have Risk Level")
    context.target_allocation = dict(zip(context.stocks, risk_based_allocation[risk_level]))
    context.bought = False
    schedule_function(
        func=before_trading_starts,
        date_rule=date_rules.every_day(),
        time_rule=time_rules.market_open(hours=1),
    )

def handle_data(context, data):
    if (context.bought == False):
        print("Buying for the first time!")
        for stock in context.stocks:
            if (context.target_allocation[stock] == 0):
                continue
            amount = (context.target_allocation[stock] * context.portfolio.cash) / data.current(stock, 'price')
            order(stock, int(amount))
            print("Buying " + str(int(amount)) + " shares of " + str(stock))
        context.bought = True

def before_trading_starts(context, data):
    #total value of portfolio
    value = context.portfolio.portfolio_value
    #calculating current weights for each position
    for stock in context.stocks:
        if (context.target_allocation[stock] == 0):
            continue
        current_holdings = data.current(stock,'close') * context.portfolio.positions[stock].amount
        weight = current_holdings/value
        growth = float(weight) / float(context.target_allocation[stock])
        #if weights of any position exceed threshold, trigger rebalance
        if (growth >= 1.05 or growth <= 0.95):
            rebalance(context, data)
            break
    else:
        #runs only when the loop finished without triggering a rebalance
        print("No need to rebalance!")

def rebalance(context, data):
    #first pass: sell the positions that are over their target weight
    for stock in context.stocks:
        current_weight = (data.current(stock, 'close') * context.portfolio.positions[stock].amount) / context.portfolio.portfolio_value
        target_weight = context.target_allocation[stock]
        distance = current_weight - target_weight
        if (distance > 0):
            amount = -1 * (distance * context.portfolio.portfolio_value) / data.current(stock,'close')
            if (int(amount) == 0):
                continue
            print("Selling " + str(int(amount)) + " shares of " + str(stock))
            order(stock, int(amount))
    #second pass: buy the positions that are under their target weight
    for stock in context.stocks:
        current_weight = (data.current(stock, 'close') * context.portfolio.positions[stock].amount) / context.portfolio.portfolio_value
        target_weight = context.target_allocation[stock]
        distance = current_weight - target_weight
        if (distance < 0):
            amount = -1 * (distance * context.portfolio.portfolio_value) / data.current(stock,'close')
            if (int(amount) == 0):
                continue
            print("Buying " + str(int(amount)) + " shares of " + str(stock))
            order(stock, int(amount))

def section_to_dict(section, parser):
    #change path to your ini file if running locally
    parser.read('/home/robo-advisor/src/universe-config.ini')
    out_dict = {}
    for key in parser[section]:
        out_dict[int(key)] = ast.literal_eval(parser[section][key])
    return(out_dict)
I’ve listed all the symbols for every universe in the initialization. The weights for that universe are determined by calling section_to_dict on the appropriate section of the INI file.
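If you want to poke at the rebalancing logic without running a full backtest, the drift check and the order sizing can be pulled out into plain functions. This is only a sketch of the same arithmetic. The function names, the two-symbol portfolio, and the prices are made up for illustration, and it returns the orders in one dictionary instead of placing sells before buys like the algorithm does.

```python
def needs_rebalance(current_weights, target_weights, band=0.05):
    """True when any holding has drifted by `band` or more, relative to its target."""
    for symbol, target in target_weights.items():
        if target == 0:
            continue  # skipped in the algorithm too, and avoids dividing by zero
        growth = current_weights[symbol] / target
        if growth >= 1 + band or growth <= 1 - band:
            return True
    return False


def rebalance_orders(current_weights, target_weights, portfolio_value, prices):
    """Signed whole-share orders: negative means sell, positive means buy."""
    orders = {}
    for symbol, target in target_weights.items():
        distance = current_weights[symbol] - target
        shares = int(-1 * distance * portfolio_value / prices[symbol])
        if shares != 0:
            orders[symbol] = shares
    return orders


# Hypothetical two-symbol portfolio that has drifted toward stocks.
target = {"VTI": 0.6, "BND": 0.4}
drifted = {"VTI": 0.66, "BND": 0.34}
prices = {"VTI": 130.0, "BND": 80.0}

print(needs_rebalance(drifted, target))
print(rebalance_orders(drifted, target, 10000.0, prices))
```

Note that the threshold is relative: a 5% band on a 0.6 target triggers at roughly 0.63, not at 0.65.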
Now, feel free to add as many universes as you like!
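One thing to watch when you add a universe: dict(zip(...)) silently stops at the shorter of its two inputs, so a row with one weight too many or too few won’t raise an error. A small sanity check before the backtest can catch that. The sketch below is my own suggestion rather than part of the algorithm, and the function name find_allocation_problems is hypothetical.

```python
def find_allocation_problems(symbols, allocation):
    """Return a list of problems; an empty list means the rows look consistent."""
    problems = []
    for risk_level, weights in sorted(allocation.items()):
        if len(weights) != len(symbols):
            problems.append(
                "risk %d: %d weights for %d symbols"
                % (risk_level, len(weights), len(symbols))
            )
        if any(w < 0 or w > 1 for w in weights):
            problems.append("risk %d: weight outside [0, 1]" % risk_level)
        if sum(weights) > 1.0 + 1e-9:
            problems.append("risk %d: weights sum to more than 1" % risk_level)
    return problems


symbols = ["VTI", "VXUS", "BND", "BNDX"]
good = {5: (0.294, 0.196, 0.343, 0.147)}
bad = {5: (0.294, 0.196, 0.343, 0.147, 0.02)}  # one weight too many

print(find_allocation_problems(symbols, good))
print(find_allocation_problems(symbols, bad))
```

In the algorithm, you could run a check like this on the output of section_to_dict right after it is loaded in initialize.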
What’s Next?
Going from here, I’m interested in visualizing all the output data from the Zipline backtest. With my next post, I’m hoping to explore the different visualizations available from that raw data.