Market data is a general term that encompasses a large range of information such as prices, trade sizes, and volume. These data can often be a crucial part of the analysis shaping one’s trading strategy. Therefore, it’s important to know how to source this data properly.
Alpaca provides easy-to-use SDKs that simplify the process of consuming market data in Python, JavaScript, and more. We’ll be using their Python SDK, which can be found in this GitHub repo. This article will explore the different historical data endpoints exposed, expand on the data response objects, and explain a simple use case for each.
The Different Tiers of Market Data
In the U.S., stock exchanges offer three tiers of market data: Level 1, Level 2, and Level 3. The higher tiers include all the features of lower tiers. Level 1 data includes price data such as open, high, low, close, volume, best bid and size, and best ask and size. Level 2 will add market depth information, which typically includes the best 5-10 bid and ask prices[1]. Level 3 adds even more depth than Level 2 and grants an investor the ability to enter or change quotes, execute orders, and send out confirmations of trades. These types of quotes are reserved for registered brokers and financial institutions[2]. Currently, Alpaca offers only Level 1 market data.
Getting Started with Alpaca
Before we can access market data through Alpaca, we’ll need our API keys. To find those, sign up here and then follow this quick guide to locate your keys. Now you can instantiate the Alpaca Trade Client in your program by importing the package and using your keys to authenticate.
# Plotly imports
import plotly.graph_objects as go
import plotly.express as px
# Importing the api and instantiating the rest client according to our keys
import alpaca_trade_api as api
API_KEY = "<Your API Key>"
API_SECRET = "<Your Secret Key>"
alpaca = api.REST(API_KEY, API_SECRET)
What endpoints are available?
The base URL for all of the endpoints discussed in the following sections is https://data.alpaca.markets, and all code snippets build on the code block above, which instantiates the Alpaca REST client as alpaca. Each section details the endpoints for querying a single stock at a time, but there are other types of queries that result in the same type of response: the response objects obtained from “Multi” and “Latest” queries are analogous to those of their respective “Single” query. Also note that the default values for start and end are, respectively, the start of the current day and now.
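To make the relationship between the base URL and the endpoint paths concrete, here is a minimal sketch that assembles request URLs by hand. The helper name build_url is hypothetical, and the query parameter names (timeframe, start, end, symbols) are assumptions that mirror the SDK arguments used in this article; the SDK normally builds these URLs for you.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.alpaca.markets"

def build_url(path_template, symbol=None, **params):
    """Join the base URL, an endpoint path, and optional query parameters.

    Hypothetical helper for illustration only; not part of the Alpaca SDK.
    """
    path = path_template.format(symbol=symbol) if symbol else path_template
    # Drop unset parameters and keep commas readable in symbol lists
    query = urlencode({k: v for k, v in params.items() if v is not None}, safe=",")
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"

# "Single" query: the symbol is part of the path
single_bars_url = build_url(
    "/v2/stocks/{symbol}/bars",
    symbol="SPY", timeframe="1Day", start="2021-01-01", end="2021-01-30",
)
# "Multi" query: the symbols travel as a query parameter (assumed name: symbols)
multi_bars_url = build_url("/v2/stocks/bars", symbols="SPY,TSLA,AAPL", timeframe="1Day")
# "Latest" query: no time window needed
latest_trade_url = build_url("/v2/stocks/{symbol}/trades/latest", symbol="AAPL")

print(single_bars_url)
print(multi_bars_url)
print(latest_trade_url)
```

An actual request would also need your API key and secret for authentication, which the REST client handles once it is instantiated.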
Bars
Single Bars URL: GET /v2/stocks/{symbol}/bars
Multi Bars URL: GET /v2/stocks/bars
The bars API will yield the open, high, low, close, volume, number of trades, timestamp, and volume-weighted average price according to the parameters queried for. To get bar data we can use the get_bars() method, which has two required parameters: symbol and timeframe. In this example, we grab the daily bar data for SPY within January 2021.
# Setting parameters before calling method
symbol = "SPY"
timeframe = "1Day"
start = "2021-01-01"
end = "2021-01-30"
# Retrieve daily bars for SPY in a DataFrame and print the first 5 rows
spy_bars = alpaca.get_bars(symbol, timeframe, start, end).df
print(spy_bars.head())
# open high ... trade_count vwap
# timestamp ...
# 2021-01-04 05:00:00+00:00 375.30 375.45 ... 623066 369.335676
# 2021-01-05 05:00:00+00:00 368.05 372.50 ... 338927 370.390186
# 2021-01-06 05:00:00+00:00 369.50 376.98 ... 575347 373.807251
# 2021-01-07 05:00:00+00:00 376.11 379.90 ... 366626 378.249233
# 2021-01-08 05:00:00+00:00 380.77 381.49 ... 391944 380.111637
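To see how the bar fields relate to the underlying trades, the following sketch collapses a handful of made-up trades into a single bar. It is a simplified illustration, not Alpaca’s actual aggregation logic (which may, for example, exclude trades with certain conditions); the function name aggregate_bar and the sample trades are hypothetical.

```python
def aggregate_bar(trades):
    """Collapse a time-ordered list of (price, size) trades into one bar."""
    prices = [price for price, _ in trades]
    volume = sum(size for _, size in trades)
    notional = sum(price * size for price, size in trades)
    return {
        "open": prices[0],            # first trade price in the interval
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],          # last trade price in the interval
        "volume": volume,             # total shares traded
        "trade_count": len(trades),
        "vwap": notional / volume,    # volume-weighted average price
    }

# Made-up trades for one interval: (price, size)
sample_trades = [(100.0, 10), (101.0, 30), (99.0, 20), (100.5, 40)]
bar = aggregate_bar(sample_trades)
print(bar)
```

Note that the VWAP weights each price by its trade size, so it can differ noticeably from both the close and the simple average of the prices.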
The most common use case of bar data is when you’d like to visually represent a stock’s movement in a chart. Taking advantage of Plotly, we can use the open, high, low, and close columns to create a candlestick chart:
# SPY bar data candlestick plot
candlestick_fig = go.Figure(data=[go.Candlestick(x=spy_bars.index,
open=spy_bars['open'],
high=spy_bars['high'],
low=spy_bars['low'],
close=spy_bars['close'])])
candlestick_fig.update_layout(
title="Candlestick chart for $SPY",
xaxis_title="Date",
yaxis_title="Price ($USD)")
candlestick_fig.show()
Getting Bars for multiple tickers at a time is very similar! We’ll reuse the variables above but change our symbol parameter. Instead of a single string for one ticker, we’ll query for Bars data using a list of strings, one per ticker. This will route our request to the multi-bars URL and should yield a DataFrame roughly three times as long, since its length scales with the number of symbols we query for.
# Setting parameters before calling method
symbols = ["SPY", "TSLA", "AAPL"]
timeframe = "1Day"
start = "2021-01-01"
end = "2021-01-30"
# Retrieve daily bars for SPY, TSLA, and AAPL in a DataFrame
bars = alpaca.get_bars(symbols, timeframe, start, end).df
The response will be structured the same as the Bars response we saw above but will contain one more column called symbol, which tells us what ticker each row belongs to. We can separate out the different tickers and print the first two rows of each by filtering with the pandas .loc indexer.
# Assigning new variables for each symbol contained in our response
spy_bars = bars.loc[bars["symbol"] == "SPY"]
tsla_bars = bars.loc[bars["symbol"] == "TSLA"]
aapl_bars = bars.loc[bars["symbol"] == "AAPL"]
print(spy_bars.head(2))
print(tsla_bars.head(2))
print(aapl_bars.head(2))
# open high ... vwap symbol
# timestamp ...
# 2021-01-04 05:00:00+00:00 375.30 375.45 ... 369.335676 SPY
# 2021-01-05 05:00:00+00:00 368.05 372.50 ... 370.390186 SPY
# [2 rows x 8 columns]
# open high ... vwap symbol
# timestamp ...
# 2021-01-04 05:00:00+00:00 720.00 744.4899 ... 731.118131 TSLA
# 2021-01-05 05:00:00+00:00 723.92 740.8400 ... 734.044099 TSLA
# [2 rows x 8 columns]
# open high ... vwap symbol
# timestamp ...
# 2021-01-04 05:00:00+00:00 133.56 133.6116 ... 129.732580 AAPL
# 2021-01-05 05:00:00+00:00 128.98 131.7400 ... 130.717944 AAPL
# [2 rows x 8 columns]
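When the list of symbols grows, writing one filter per ticker gets repetitive. As an alternative sketch, pandas groupby can split the frame into one DataFrame per symbol in a single pass. The frame below is a hand-built stand-in for the multi-symbol response, using a few values from the output above; it is not fetched from the API.

```python
import pandas as pd

# Stand-in for the multi-symbol response (subset of columns, values from the output above)
bars = pd.DataFrame(
    {
        "open": [375.30, 368.05, 720.00, 723.92, 133.56, 128.98],
        "vwap": [369.335676, 370.390186, 731.118131, 734.044099, 129.732580, 130.717944],
        "symbol": ["SPY", "SPY", "TSLA", "TSLA", "AAPL", "AAPL"],
    },
    index=pd.to_datetime(["2021-01-04 05:00:00+00:00", "2021-01-05 05:00:00+00:00"] * 3),
)
bars.index.name = "timestamp"

# One DataFrame per ticker, keyed by symbol
bars_by_symbol = {symbol: frame for symbol, frame in bars.groupby("symbol")}

print(sorted(bars_by_symbol))
print(bars_by_symbol["TSLA"].head(2))
```

The dictionary lookup bars_by_symbol["TSLA"] then plays the same role as the tsla_bars variable.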
Trades
Single Trades URL: GET /v2/stocks/{symbol}/trades
Multi Trades URL: GET /v2/stocks/trades
Latest Trades URL: GET /v2/stocks/{symbol}/trades/latest
The trades API will respond with an array representing every trade that happened within your defined time interval. This includes the trade conditions, ID, price, size, timestamp, exchange, and tape. To make a request to this endpoint with the SDK, you can use the get_trades() method. The only required parameter is the stock symbol. This example gets the trades for Apple on the current day, limiting the number of trades to 10,000.
# Setting parameters before calling method
symbol = "AAPL"
limit = 10000
# Retrieve trades for Apple in a DataFrame and print the first 5 rows
aapl_trades = alpaca.get_trades(symbol, limit=limit).df
print(aapl_trades.head())
# exchange price ... id tape
# timestamp ...
# 2022-01-06 09:00:00.087492608+00:00 P 174.78 ... 1 C
# 2022-01-06 09:00:00.245505024+00:00 P 174.78 ... 2 C
# 2022-01-06 09:00:00.245579008+00:00 P 174.78 ... 3 C
# 2022-01-06 09:00:00.245615872+00:00 P 174.78 ... 4 C
# 2022-01-06 09:00:00.248960+00:00 K 174.89 ... 1 C
One application of trades data is observing which exchanges these trades are executed on. We can use Plotly to create and display a histogram that aggregates our data by exchange.
# AAPL trade exchange histogram
exchange_histogram = px.histogram(aapl_trades, x="exchange")
exchange_histogram.update_layout(
title="Frequency of exchanges in the first 10,000 trades of $AAPL on January 19, 2022",
yaxis_title="Number of trades",
xaxis_title="Exchange")
exchange_histogram.show()
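The histogram is essentially a count of trades per exchange code. If you want the numbers rather than the picture, the same aggregation can be sketched with the standard library. The sample below reuses only the first five rows of the trade output shown earlier, so the resulting shares are illustrative and not representative of the full data.

```python
from collections import Counter

# (exchange code, price) pairs taken from the first five sample rows above
sample_trades = [
    ("P", 174.78),
    ("P", 174.78),
    ("P", 174.78),
    ("P", 174.78),
    ("K", 174.89),
]

# Count trades per exchange, then convert the counts to shares of the total
exchange_counts = Counter(exchange for exchange, _ in sample_trades)
total = sum(exchange_counts.values())
exchange_share = {exchange: count / total for exchange, count in exchange_counts.most_common()}

print(exchange_counts.most_common())
print(exchange_share)
```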
Please note: All investments involve risk and the past performance of a security, or financial product does not guarantee future results or returns.
Quotes
Single Quotes URL: GET /v2/stocks/{symbol}/quotes
Multi Quotes URL: GET /v2/stocks/quotes
Latest Quotes URL: GET /v2/stocks/{symbol}/quotes/latest
The quotes API will yield the National Best Bid and Offer (NBBO), reporting the lowest ask price, size, and its exchange, the highest bid price, size, and its exchange, the quote conditions, a timestamp, and the tape. The method for getting quote data is get_quotes(), with the only required parameter being the stock symbol. We’ll use start and end to constrain our quotes to a specific time interval, and limit to set the maximum number of quotes.
# Setting parameters for method call
symbol = "SHOP"
start = "2021-01-04T14:30:00Z"
end = "2021-01-04T21:00:00Z"
limit = 1000
# Get Shopify quotes in a dataframe and print the first 5 rows
shop_quotes = alpaca.get_quotes(symbol, start, end, limit).df
print(shop_quotes.head())
# ask_exchange ask_price ... conditions tape
# timestamp ...
# 2021-01-04 14:30:00.183900+00:00 M 1142.05 ... [?] A
# 2021-01-04 14:30:00.183900+00:00 V 1138.00 ... [?] A
# 2021-01-04 14:30:00.545500+00:00 M 1142.05 ... [?] A
# 2021-01-04 14:30:00.585800+00:00 M 1142.05 ... [?] A
# 2021-01-04 14:30:00.591000+00:00 M 1142.05 ... [?] A
Quotes data is useful if you want to see what the tightest spread looks like for a given stock over a time interval. We can see how that spread changes over time using Plotly. To get the spread, we’ll subtract the bid prices from the ask prices and plot the difference with respect to time on a line plot.
quotes_spread = shop_quotes["ask_price"] - shop_quotes["bid_price"]
spread_plot = px.line(shop_quotes, x=shop_quotes.index, y=quotes_spread)
spread_plot.update_layout(
title="Bid-ask spread of $SHOP as a function of time",
xaxis_title="Time of day",
yaxis_title="Bid-ask spread ($USD)"
)
spread_plot.show()
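An absolute spread in dollars is hard to compare across stocks with very different prices, so it is often expressed relative to the midpoint, in basis points. The sketch below shows that arithmetic on a few quotes. The ask prices echo the sample output above, but the bid prices are made up for illustration, and spread_stats is a hypothetical helper.

```python
def spread_stats(ask, bid):
    """Return the absolute spread and the spread in basis points of the midpoint."""
    spread = ask - bid
    mid = (ask + bid) / 2
    return spread, spread / mid * 10_000

# (ask_price, bid_price) pairs; asks from the sample output, bids are hypothetical
sample_quotes = [(1142.05, 1130.00), (1138.00, 1130.00), (1142.05, 1131.50)]

for ask, bid in sample_quotes:
    spread, bps = spread_stats(ask, bid)
    print(f"ask={ask:.2f} bid={bid:.2f} spread={spread:.2f} ({bps:.1f} bps)")
```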
Snapshots
Single Snapshot URL: GET /v2/stocks/{symbol}/snapshot
Multi Snapshots URL: GET /v2/stocks/snapshots
You can think of the Snapshots API as a combination of all the previous endpoints. A response from this API is a snapshot object that contains key-value pairs for a given stock’s latest trade, latest quote, most recent 1-minute bar data, and the two most recent daily bars. The method for this endpoint is get_snapshot(), and the only required parameter is the stock symbol.
symbol = "SPY"
snapshot = alpaca.get_snapshot(symbol=symbol)
You can read more about the properties of the response object inside the SDK repo, or by viewing the parsed version of the object below.
{
"symbol": "SPY",
"latestTrade": {
"t": "2022-01-04T21:49:04.055398242Z",
"x": "V",
"p": 477.38,
"s": 500,
"c": [
" ",
"T"
],
"i": 56592424269399,
"z": "B"
},
"latestQuote": {
"t": "2022-01-04T21:58:24.500619778Z",
"ax": "V",
"ap": 477.5,
"as": 5,
"bx": "V",
"bp": 477.37,
"bs": 5,
"c": [
"R"
],
"z": "B"
},
"minuteBar": {
"t": "2022-01-04T21:49:00Z",
"o": 477.38,
"h": 477.38,
"l": 477.38,
"c": 477.38,
"v": 500,
"n": 1,
"vw": 477.38
},
"dailyBar": {
"t": "2022-01-04T05:00:00Z",
"o": 479.21,
"h": 479.98,
"l": 475.62,
"c": 477.47,
"v": 1592654,
"n": 13044,
"vw": 477.803188
},
"prevDailyBar": {
"t": "2022-01-03T05:00:00Z",
"o": 476.32,
"h": 477.79,
"l": 473.855,
"c": 477.75,
"v": 1095730,
"n": 9213,
"vw": 476.613039
}
}
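Because the snapshot bundles the current and previous daily bars with the latest quote, a few lines are enough to derive the day’s change and the current spread. This sketch parses a trimmed copy of the JSON above with the standard library, keeping only the keys it needs; it works on the raw key names (c, ap, bp) rather than on the SDK’s entity objects.

```python
import json

# Trimmed copy of the snapshot shown above (only the fields used here)
snapshot_json = """
{
  "symbol": "SPY",
  "latestQuote": {"ap": 477.5, "bp": 477.37},
  "dailyBar": {"c": 477.47},
  "prevDailyBar": {"c": 477.75}
}
"""
snapshot = json.loads(snapshot_json)

close = snapshot["dailyBar"]["c"]            # current daily bar close
prev_close = snapshot["prevDailyBar"]["c"]   # previous daily bar close
change = close - prev_close
change_pct = change / prev_close * 100

# Latest quoted spread: ask price minus bid price
spread = snapshot["latestQuote"]["ap"] - snapshot["latestQuote"]["bp"]

print(f'{snapshot["symbol"]}: {change:+.2f} ({change_pct:+.3f}%) vs. previous close, spread {spread:.2f}')
```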
A great use case for the snapshots API is checking the most recent price action in a stock. One way to do that is to look at the latest quote for a ticker and check the current spread. First, get the snapshot and quote data.
# Setting parameters for method call
symbol = "GME"
# Get GameStop market snapshot and print the latest quote
gme_snapshot = alpaca.get_snapshot(symbol)
latest_quote = gme_snapshot.latest_quote
print(latest_quote)
# QuoteV2({ 'ap': 127,
# 'as': 1,
# 'ax': 'V',
# 'bp': 105.05,
# 'bs': 1,
# 'bx': 'V',
# 'c': ['R'],
# 't': '2022-01-19T20:13:40.191542779Z',
# 'z': 'A'})
Now we can use Plotly to visually explain what the stock is currently being quoted for and what the spread is.
x = ["Ask Price", "Bid Price"]
y = [latest_quote.ap, latest_quote.bp]
quotes_figure = go.Figure(data=go.Bar(x=x, y=y))
quotes_figure.update_layout(
title="Bar graph of Bid Price and Ask Price for $GME as of January 19, 2022 at 20:13:40 UTC",
yaxis_title="Price ($USD)"
)
quotes_figure.show()
Conclusion
In this article, for each type of historical data API, we explored what it is, how to get a response, and what’s contained in the response, and we used Plotly to illustrate one simple use case for it. We’ve also seen how the Alpaca Python client simplifies the process of accessing the data made available by these endpoints.
If you’re ready to take your knowledge to the next level, check out how you can utilize these endpoints and Plotly to create your own pairs trading strategy.
References
[1] A. Ganti, “Level 1 definition,” Investopedia, 21-Sep-2021. [Online]. Available: https://www.investopedia.com/terms/l/level1.asp. [Accessed: 07-Jan-2022].
[2] A. Hayes, “Level III quote,” Investopedia, 13-Sep-2021. [Online]. Available: https://www.investopedia.com/terms/l/level3.asp. [Accessed: 07-Jan-2022].
Please note that this article is for educational and informational purposes only. All screenshots are for illustrative purposes only. Alpaca does not recommend any specific securities or investment strategies.
All investments involve risk and the past performance of a security, or financial product does not guarantee future results or returns. Keep in mind that while diversification may help spread risk it does not assure a profit, or protect against loss, in a down market. There is always the potential of losing money when you invest in securities, or other financial products. Investors should consider their investment objectives and risks carefully before investing.
Alpaca does not prepare, edit, or endorse Third Party Content. Alpaca does not guarantee the accuracy, timeliness, completeness or usefulness of Third Party Content, and is not responsible or liable for any content, advertising, products, or other materials on or available from third party sites.
Brokerage services are provided by Alpaca Securities LLC ("Alpaca"), member FINRA/SIPC, a wholly-owned subsidiary of AlpacaDB, Inc. Technology and services are offered by AlpacaDB, Inc.
This is not an offer, solicitation of an offer, or advice to buy or sell securities, or open a brokerage account in any jurisdiction where Alpaca is not registered (Alpaca is registered only in the United States).