# Linear Regression For a Momentum Based Trading Strategy Using Zipline Trader

In this post, we demonstrate how to create a simple pipeline that uses linear regression to identify stock momentum and filters for the stocks with the strongest momentum indicator. We then analyze going long and short on stocks based on this signal. We will use Alphalens to analyze the quality of the factor and pyfolio to analyze the returns it produces. Finally, we will create a simple zipline-trader algorithm that trades on that signal and backtest it.

1. Loading the Data Bundle
2. Calculating the Linear Regression Factor
3. Creating the Pipeline
4. Analyzing Performance Using Alphalens
5. Working with pyfolio
6. Backtesting our Alpha Factor
7. Using pyfolio One More Time
8. Final Thoughts

All the data used in this post comes from the Alpaca data API, which is available with a free account.

Disclaimer: This is not a profitable strategy that you could deploy to live markets. It is written as an instructional post that shows the strengths of this framework and what you can do with it.

Now, let's get things started :)

## Loading the Data Bundle

We use the Alpaca data service to create a data bundle that we feed into the zipline-trader engine.

## Calculating the Linear Regression Factor

This factor runs a linear regression over one year of stock log returns and uses the resulting slope as the factor value. It is based on the Alphalens example library.
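To make the idea of a regression slope concrete, here is a minimal standard-library sketch. It is not the factor implementation used in this post: the function names are hypothetical, and it assumes one plausible reading of the description, namely fitting a straight line to the cumulative log returns against the day index and using the slope as the momentum value.

```python
import math


def log_returns(prices):
    """Daily log returns from a list of closing prices."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


def ols_slope(ys):
    """Least-squares slope of ys regressed on the index 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    variance = sum((x - x_mean) ** 2 for x in range(n))
    return covariance / variance


def momentum_slope(prices):
    """Assumption: regress cumulative log returns on time and return the slope."""
    cumulative = []
    total = 0.0
    for r in log_returns(prices):
        total += r
        cumulative.append(total)
    return ols_slope(cumulative)
```

With roughly 253 daily closes (about one trading year), a steadily rising stock gets a positive slope and a steadily falling one gets a negative slope, which is what lets us rank stocks by momentum.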

## Creating the Pipeline

Let's create a pipeline that:

1. Starts from our entire universe (S&P 500)
2. Calculates AverageDollarVolume for the past 30 days, and selects the top 20 stocks.
3. Calculates MyFactor for the 20 stocks selected in the previous step.
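The three steps above can be sketched in plain Python to show the data flow. This is only an illustration under stated assumptions: it does not use the zipline-trader pipeline API, `run_pipeline` and `average_dollar_volume` are hypothetical helpers, and the universe is assumed to be a dict that maps each symbol to a pair of (closing prices, volumes).

```python
def average_dollar_volume(closes, volumes, window=30):
    """Mean of close * volume over the last `window` days."""
    pairs = list(zip(closes, volumes))[-window:]
    if not pairs:
        raise ValueError("no price/volume data")
    return sum(close * volume for close, volume in pairs) / len(pairs)


def run_pipeline(universe, factor_fn, top_n=20, window=30):
    """Screen by dollar volume, then compute the factor for the survivors.

    universe:  dict symbol -> (closes, volumes)   (assumed layout)
    factor_fn: callable taking a list of closes and returning a number
    """
    adv = {
        symbol: average_dollar_volume(closes, volumes, window)
        for symbol, (closes, volumes) in universe.items()
    }
    # Step 2: keep the top_n most liquid symbols.
    selected = sorted(adv, key=adv.get, reverse=True)[:top_n]
    # Step 3: compute the factor only for the selected symbols.
    return {symbol: factor_fn(universe[symbol][0]) for symbol in selected}
```

Screening before computing the factor mirrors the pipeline described above: the liquidity filter defines which stocks are eligible, and the factor is only evaluated on that subset.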

### Plot the pipeline

We can plot our pipeline to get a visual sense of what the process does.

## Analyzing Performance Using Alphalens

Now we want to check whether our factor has the potential to generate alpha. We will use Alphalens for this.

### Data preparation

Alphalens input consists of two types of information: the factor values for the time period under analysis and the historical asset prices (or returns).
Alphalens doesn't need to know how the factor was computed; the historical factor values are enough. This is useful because we can use the tool to evaluate factors for which we have the data but not the implementation details.

Alphalens requires that factor and price data follow a specific format. It provides a utility function, `get_clean_factor_and_forward_returns`, that accepts factor data, price data, and optionally group information (for example sector groups, which are useful for sector-specific analysis) and returns the data suitably formatted for Alphalens.
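Conceptually, this preparation step pairs each factor value with the return the asset earns over the following period(s). The sketch below illustrates that alignment with the standard library only. It is not the Alphalens implementation: the helper names and the dict-based layout are assumptions made for illustration, whereas the real utility works on pandas objects.

```python
def forward_returns(prices, period=1):
    """Return from day t to day t + period; None where the future is unknown."""
    result = []
    for t in range(len(prices)):
        if t + period < len(prices):
            result.append(prices[t + period] / prices[t] - 1.0)
        else:
            result.append(None)
    return result


def align_factor_with_forward_returns(factor, prices, period=1):
    """Pair each factor value with the forward return of the same asset.

    factor: dict (day_index, asset) -> factor value   (assumed layout)
    prices: dict asset -> list of closing prices      (assumed layout)
    Rows without a forward return (the last `period` days) are dropped.
    """
    forward = {asset: forward_returns(series, period) for asset, series in prices.items()}
    rows = []
    for (day, asset), value in sorted(factor.items()):
        future = forward[asset][day]
        if future is not None:
            rows.append((day, asset, value, future))
    return rows
```

The important point is the direction of time: a factor value observed on day t is matched with the return realized after day t, so the analysis measures predictive power rather than coincidence.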

### Running Alphalens

Once the factor data is ready, running the Alphalens analysis is simple: a single function call generates the factor report (statistical information and plots). Remember that you can use Python's built-in `help` function to view the details of any function.

```python
al.tears.create_full_tear_sheet(factor_data, long_short=True, group_neutral=True, by_group=True)
```

[Figure: part of the full report, edited down for readability]

#### These reports are also available

```python
al.tears.create_returns_tear_sheet(factor_data,
                                   long_short=True,
                                   group_neutral=False,
                                   by_group=False)

al.tears.create_information_tear_sheet(factor_data,
                                       group_neutral=False,
                                       by_group=False)

al.tears.create_turnover_tear_sheet(factor_data)

al.tears.create_event_returns_tear_sheet(factor_data, prices,
                                         avgretplot=(5, 15),
                                         long_short=True,
                                         group_neutral=False,
                                         std_bar=True,
                                         by_group=False)
```

## Working with pyfolio

We can use pyfolio to analyze the factor returns as if they were generated by a backtest.
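To treat a factor like a backtest, the factor values first have to be turned into a series of portfolio returns. The sketch below shows one common way to do that: a dollar-neutral portfolio whose weights are proportional to the demeaned factor values. The helper names are hypothetical and this is not the Alphalens or pyfolio API; it only illustrates the arithmetic behind such a conversion.

```python
def factor_weights(factors):
    """Demeaned factor values scaled so the absolute weights sum to 1."""
    mean = sum(factors.values()) / len(factors)
    demeaned = {asset: value - mean for asset, value in factors.items()}
    gross = sum(abs(value) for value in demeaned.values())
    if gross == 0:
        # All factor values are identical: no long/short view to express.
        return {asset: 0.0 for asset in factors}
    return {asset: value / gross for asset, value in demeaned.items()}


def portfolio_return(factors, asset_returns):
    """One-period return of the factor-weighted long/short portfolio."""
    weights = factor_weights(factors)
    return sum(weight * asset_returns[asset] for asset, weight in weights.items())
```

Computing `portfolio_return` for every day yields a daily return series, which is the kind of input a returns-based performance report works from.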

[Figure: part of the full report, edited down for readability]

## Backtesting our Alpha Factor

Let's now create a simple algorithm that wraps our pipeline and backtest it against our data bundle as if we were running it in a live market. This is more realistic than what we just did with pyfolio, because in live trading we do not simply run a factor. There are many moving parts, and the factor needs to be wrapped in logic that works under real market conditions.

Our simple algorithm will:

1. Run the pipeline we created every day.
2. Go long the top 5 stocks and short the bottom 5.
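The selection rule in step 2 can be sketched as a small function that turns the day's factor values into target portfolio weights. The function name and the equal-weighting scheme (half of the capital long, half short) are assumptions made for illustration; the post does not specify how positions are sized.

```python
def target_weights(factor_values, n=5):
    """Equal-weight long the n highest factor values and short the n lowest.

    factor_values: dict symbol -> factor value
    Returns dict symbol -> target weight; symbols in neither leg are omitted.
    """
    if len(factor_values) < 2 * n:
        raise ValueError("need at least 2 * n assets to form both legs")
    ranked = sorted(factor_values, key=factor_values.get)  # ascending by factor
    shorts = ranked[:n]
    longs = ranked[-n:]
    # Assumption: 50% of capital per leg, split equally within each leg.
    weights = {symbol: 0.5 / n for symbol in longs}
    weights.update({symbol: -0.5 / n for symbol in shorts})
    return weights
```

In a daily rebalance, these weights would be recomputed from the latest pipeline output, and any position that is no longer in either leg would be closed.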

### Backtest Execution

Let's now run our backtest for the year 2020.

## Using pyfolio One More Time

We can use pyfolio once again to analyze the performance of the backtest we just executed.

[Figure: part of the full report, edited down for readability]

## Final Thoughts

All in all, we got pretty good results with positive returns, and we did better than our benchmark (SPY) for the year 2020. This could therefore be a basis for creating something more robust.

### What next?

• We had significant drawdowns; one could work on minimizing these.
• One can backtest over a much longer period.
• One can optimize the pipeline. The example above is a simple setup, not an optimized one, and different parameters would produce different results.
• One could implement this in paper trading. Backtests are useful, but it is important to see what happens in real time.
• The wrapping algorithm is extremely simplified; much more work could be done there.
• One could make the algorithm sector neutral, or perhaps even find sectors that respond better to the signal.
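Since drawdowns are the first item on that list, it helps to be able to measure them when comparing variations of the strategy. Below is a minimal sketch of a maximum drawdown calculation from a series of daily returns. The function name is hypothetical, and it assumes simple (not log) returns compounded daily.

```python
def max_drawdown(daily_returns):
    """Largest peak-to-trough decline of the compounded equity curve.

    Returns a non-positive number, e.g. -0.25 for a 25% drawdown.
    """
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst
```

Running this on the daily returns of two backtests gives a quick way to check whether a change to the pipeline or the wrapping algorithm actually reduced the worst decline.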