Understanding Alpaca’s Market Data API with Pandas and Plotly

Market data is a general term that encompasses a large range of information such as prices, trade sizes, and volume. These data can often be a crucial part of the analysis shaping one’s trading strategy. Therefore, it’s important to know how to source this data properly.

Alpaca provides easy-to-use SDKs that simplify the process of consuming market data in Python, JavaScript, and more. We’ll be using the Python SDK, which can be found in this GitHub repo. This article explores the different historical data endpoints, expands on the data response objects, and walks through a simple use case for each.

The Different Tiers of Market Data

In the U.S., stock exchanges offer three tiers of market data: Level 1, Level 2, and Level 3. Each higher tier includes all the features of the tiers below it. Level 1 data includes price data such as open, high, low, close, volume, best bid and size, and best ask and size. Level 2 adds market depth information, which typically includes the best 5-10 bid and ask prices[1]. Level 3 adds even more depth than Level 2 and grants the ability to enter or change quotes, execute orders, and send out confirmations of trades. Level 3 quotes are reserved for registered brokers and financial institutions[2]. Currently, Alpaca offers only Level 1 market data.

Getting Started with Alpaca

Before we can access market data through Alpaca, we’ll need our API keys. To get them, sign up here and then follow this quick guide to find your keys. You can then instantiate the Alpaca REST client in your program by importing the package and authenticating with your keys.

# Plotly imports 
import plotly.graph_objects as go
import plotly.express as px

# Import the API package and instantiate the REST client with our keys
import alpaca_trade_api as api

API_KEY = "<Your API Key>"
API_SECRET = "<Your Secret Key>"
alpaca = api.REST(API_KEY, API_SECRET)

What endpoints are available?

The base URL for all of the endpoints discussed in the following sections is https://data.alpaca.markets, and all code snippets build on the code block above, which instantiates the Alpaca REST client as alpaca. Each section details the endpoints for querying a single stock at a time, but other query types return the same kind of response: the response objects from “Multi” and “Latest” queries are analogous to those of the corresponding “Single” query. Also note that in these snippets, start defaults to the start of the current day and end defaults to the current time.
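
To make the URL structure concrete, here is a minimal sketch that assembles a single-symbol bars URL by hand using only the standard library. The helper name build_bars_url and its now argument are hypothetical and not part of the SDK. The defaults mirror the behavior described above, and UTC is assumed for "start of the current day", which may not match what the SDK does internally.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "https://data.alpaca.markets"


def build_bars_url(symbol, timeframe, start=None, end=None, now=None):
    """Assemble a single-symbol bars URL (hypothetical helper, not part of the SDK)."""
    # `now` is injectable so the defaults can be checked deterministically.
    now = now or datetime.now(timezone.utc)
    # Assumed defaults: start of the current (UTC) day and the current time.
    start = start or now.replace(hour=0, minute=0, second=0, microsecond=0)
    end = end or now
    query = urlencode({
        "timeframe": timeframe,
        "start": start.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "end": end.strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    return f"{BASE_URL}/v2/stocks/{symbol}/bars?{query}"


fixed_now = datetime(2021, 1, 4, 15, 30, tzinfo=timezone.utc)
print(build_bars_url("SPY", "1Day", now=fixed_now))
```

A real request would also need authentication headers, which the SDK client adds for you. This sketch only shows how the path and query string fit together.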

Bars

Single Bars URL: GET /v2/stocks/{symbol}/bars

Multi Bars URL: GET /v2/stocks/bars

The bars API yields the open, high, low, close, volume, number of trades, timestamp, and volume-weighted average price (VWAP) for the parameters you query. To get bar data we can use the get_bars() method, which has two required parameters: symbol and timeframe. In this example, we grab the daily bar data for SPY for January 2021.

# Setting parameters before calling method
symbol = "SPY"
timeframe = "1Day"
start = "2021-01-01"
end = "2021-01-30"
# Retrieve daily bars for SPY in a DataFrame and print the first 5 rows
spy_bars = alpaca.get_bars(symbol, timeframe, start, end).df
print(spy_bars.head())


#                            open    high  ...  trade_count        vwap
# timestamp                                  ...                         
# 2021-01-04 05:00:00+00:00  375.30  375.45  ...       623066  369.335676
# 2021-01-05 05:00:00+00:00  368.05  372.50  ...       338927  370.390186
# 2021-01-06 05:00:00+00:00  369.50  376.98  ...       575347  373.807251
# 2021-01-07 05:00:00+00:00  376.11  379.90  ...       366626  378.249233
# 2021-01-08 05:00:00+00:00  380.77  381.49  ...       391944  380.111637
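
To show how the columns of a bar relate to the underlying trades, here is a minimal sketch that collapses a list of trades into one bar, including the volume-weighted average price. The function name aggregate_bar and the sample trades are made up for illustration. Alpaca's own aggregation may treat certain trade conditions differently, so this is not a reproduction of how the API builds its bars.

```python
def aggregate_bar(trades):
    """Collapse (price, size) trades, given in time order, into one bar with VWAP."""
    prices = [price for price, _ in trades]
    volume = sum(size for _, size in trades)
    # VWAP weights each trade price by the number of shares traded at that price.
    vwap = sum(price * size for price, size in trades) / volume
    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
        "volume": volume,
        "trade_count": len(trades),
        "vwap": round(vwap, 6),
    }


# Made-up trades for illustration only: (price, size)
sample_trades = [(100.0, 200), (101.0, 100), (99.5, 300), (100.5, 400)]
bar = aggregate_bar(sample_trades)
print(bar)
```

Because larger trades carry more weight, the VWAP can sit noticeably away from the simple average of the open, high, low, and close.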

The most common use case of bar data is when you’d like to visually represent a stock’s movement in a chart. Taking advantage of Plotly, we can use the open, high, low, and close columns to create a candlestick chart:

# SPY bar data candlestick plot
candlestick_fig = go.Figure(data=[go.Candlestick(x=spy_bars.index,
               open=spy_bars['open'],
               high=spy_bars['high'],
               low=spy_bars['low'],
               close=spy_bars['close'])])
candlestick_fig.update_layout(
    title="Candlestick chart for $SPY",
    xaxis_title="Date",
    yaxis_title="Price ($USD)")
candlestick_fig.show()

Figure 1 - $SPY candlestick chart

Getting bars for multiple tickers at a time is very similar. We’ll reuse the variables above but change our symbol parameter: instead of a single string, we’ll pass a list of strings, one per ticker. This routes our request to the multi-bars URL and yields a DataFrame with roughly three times as many rows, since the row count scales with the number of symbols we’re querying.

# Setting parameters before calling method
symbols = ["SPY", "TSLA", "AAPL"]
timeframe = "1Day"
start = "2021-01-01"
end = "2021-01-30"
# Retrieve daily bars for SPY, TSLA, and AAPL in a DataFrame
bars = alpaca.get_bars(symbols, timeframe, start, end).df

The response is structured the same as the bars response we saw above but contains one more column called symbol, which tells us what ticker each row belongs to. We can separate out the different tickers with the pandas .loc indexer and print the top two rows of each.


# Assigning new variables for each symbol contained in our response
spy_bars = bars.loc[bars["symbol"] == "SPY"]
tsla_bars = bars.loc[bars["symbol"] == "TSLA"]
aapl_bars = bars.loc[bars["symbol"] == "AAPL"]
print(spy_bars.head(2))
print(tsla_bars.head(2))
print(aapl_bars.head(2))

#                              open    high  ...        vwap  symbol
# timestamp                                  ...                    
# 2021-01-04 05:00:00+00:00  375.30  375.45  ...  369.335676     SPY
# 2021-01-05 05:00:00+00:00  368.05  372.50  ...  370.390186     SPY

# [2 rows x 8 columns]
#                              open      high  ...        vwap  symbol
# timestamp                                    ...                    
# 2021-01-04 05:00:00+00:00  720.00  744.4899  ...  731.118131    TSLA
# 2021-01-05 05:00:00+00:00  723.92  740.8400  ...  734.044099    TSLA

# [2 rows x 8 columns]
#                              open      high  ...        vwap  symbol
# timestamp                                    ...                    
# 2021-01-04 05:00:00+00:00  133.56  133.6116  ...  129.732580    AAPL
# 2021-01-05 05:00:00+00:00  128.98  131.7400  ...  130.717944    AAPL

# [2 rows x 8 columns]
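
As an alternative to writing one .loc filter per ticker, the same split can be sketched with a pandas groupby. The frame below is a made-up stand-in for the multi-symbol response (only a close and a symbol column, with illustrative values), so the snippet runs without an API call.

```python
import pandas as pd

# Made-up rows shaped like the multi-symbol response (values are illustrative only).
bars = pd.DataFrame(
    {
        "close": [10.0, 20.0, 30.0, 11.0, 21.0, 31.0],
        "symbol": ["SPY", "TSLA", "AAPL", "SPY", "TSLA", "AAPL"],
    },
    index=pd.to_datetime(
        ["2021-01-04", "2021-01-04", "2021-01-04",
         "2021-01-05", "2021-01-05", "2021-01-05"],
        utc=True,
    ).rename("timestamp"),
)

# One DataFrame per ticker, keyed by symbol, without repeating a .loc filter.
bars_by_symbol = {symbol: frame for symbol, frame in bars.groupby("symbol")}

for symbol, frame in bars_by_symbol.items():
    print(symbol, len(frame), frame["close"].iloc[-1])
```

This scales to any number of symbols, whereas the .loc approach needs a new line for each ticker.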

Trades

Single Trades URL: GET /v2/stocks/{symbol}/trades

Multi Trades URL: GET /v2/stocks/trades

Latest Trades URL: GET /v2/stocks/{symbol}/trades/latest

The trades API responds with an array representing every trade that happened within your defined time interval. Each trade includes the trade conditions, ID, price, size, timestamp, exchange, and tape. To make a request to this endpoint with the SDK, you can use the get_trades() method. The only required parameter is the stock symbol. This example gets the trades for Apple on the current day, limiting the number of trades to 10,000.

# Setting parameters before calling method
symbol = "AAPL"
limit = 10000
# Retrieve trades for Apple in a DataFrame and print the first 5 rows
aapl_trades = alpaca.get_trades(symbol, limit=limit).df
print(aapl_trades.head())

#                                    exchange   price  ...  id tape
# timestamp                                             ...         
# 2022-01-06 09:00:00.087492608+00:00        P  174.78  ...   1    C
# 2022-01-06 09:00:00.245505024+00:00        P  174.78  ...   2    C
# 2022-01-06 09:00:00.245579008+00:00        P  174.78  ...   3    C
# 2022-01-06 09:00:00.245615872+00:00        P  174.78  ...   4    C
# 2022-01-06 09:00:00.248960+00:00           K  174.89  ...   1    C

One application of trades data is observing which exchanges the trades are executed on. We can use Plotly to create and display a histogram that counts trades by exchange.

# AAPL trade exchange histogram
exchange_histogram = px.histogram(aapl_trades, x="exchange")
exchange_histogram.update_layout(
    title="Frequency of exchanges in the first 10,000 trades of $AAPL on January 19, 2022",
    yaxis_title="Number of trades",
    xaxis_title="Exchange")
exchange_histogram.show()

Figure 2 - Exchange histogram for $AAPL trades
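
The histogram is just a tally of the exchange column. As a rough sketch of what it aggregates, the same counts can be computed with the standard library. The exchange codes below are made up and stand in for the exchange column of the trades response.

```python
from collections import Counter

# Made-up exchange codes standing in for the `exchange` column of the trades response.
exchanges = ["P", "P", "K", "P", "V", "K", "P"]

# Count how many trades printed on each exchange, most active first.
counts = Counter(exchanges).most_common()
for code, n in counts:
    share = n / len(exchanges)
    print(f"{code}: {n} trades ({share:.1%})")
```

On the real DataFrame, aapl_trades["exchange"].value_counts() would give the same kind of tally directly in pandas.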

Please note: All investments involve risk and the past performance of a security, or financial product does not guarantee future results or returns.

Quotes

Single Quotes URL: GET /v2/stocks/{symbol}/quotes

Multi Quotes URL: GET /v2/stocks/quotes

Latest Quotes URL: GET /v2/stocks/{symbol}/quotes/latest

The quotes API will yield the National Best Bid and Offer (NBBO), reporting the lowest ask price, size, and its exchange, the highest bid price, size, and its exchange, the quote conditions, a timestamp, and the tape. The method for getting quote data is get_quotes(), with the only required parameter being the stock symbol. We’ll use start and end to constrain our quotes to a specific time interval, and limit to set the maximum number of quotes.



# Setting parameters for method call
symbol = "SHOP"
start = "2021-01-04T14:30:00Z"
end = "2021-01-04T21:00:00Z"
limit = 1000
# Get Shopify quotes in a dataframe and print the first 5 rows
shop_quotes = alpaca.get_quotes(symbol, start, end, limit).df
print(shop_quotes.head())


#                                ask_exchange  ask_price  ...  conditions tape
# timestamp                                                 ...                 
# 2021-01-04 14:30:00.183900+00:00            M    1142.05  ...         [?]    A
# 2021-01-04 14:30:00.183900+00:00            V    1138.00  ...         [?]    A
# 2021-01-04 14:30:00.545500+00:00            M    1142.05  ...         [?]    A
# 2021-01-04 14:30:00.585800+00:00            M    1142.05  ...         [?]    A
# 2021-01-04 14:30:00.591000+00:00            M    1142.05  ...         [?]    A
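
The start and end values in the quotes request are given in UTC. The sketch below converts them to U.S. Eastern time to show that the window covers the regular trading session. It assumes a fixed UTC-5 offset, which holds for January dates only, and the helper name to_eastern is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset (Eastern Standard Time), valid for January dates.
# During daylight saving time the offset is UTC-4, so a real program should use a
# proper time zone database instead of a fixed offset.
EST = timezone(timedelta(hours=-5), name="EST")


def to_eastern(utc_string):
    """Parse an ISO timestamp ending in 'Z' and convert it to the fixed EST offset."""
    parsed = datetime.strptime(utc_string, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc).astimezone(EST)


for label, stamp in [("start", "2021-01-04T14:30:00Z"), ("end", "2021-01-04T21:00:00Z")]:
    print(label, to_eastern(stamp).strftime("%Y-%m-%d %H:%M %Z"))
```

The same offset explains the 05:00:00+00:00 timestamps on the daily bars earlier in the article: they correspond to midnight Eastern time in winter.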

Quotes data is useful if you want to see what the tightest spread looks like for a given stock over a time interval. We can see how that spread changes over time using Plotly. To get the spread, we’ll subtract the bid prices from the ask prices and plot the difference with respect to time on a line plot.

quotes_spread = shop_quotes["ask_price"] - shop_quotes["bid_price"]
spread_plot = px.line(shop_quotes, x=shop_quotes.index, y=quotes_spread)
spread_plot.update_layout(
    title="Bid-ask spread of $SHOP as a function of time",
    xaxis_title="Time of day",
    yaxis_title="Bid-ask spread ($USD)"
)
spread_plot.show()

Figure 3 - $SHOP bid-ask spread as a function of time
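
A dollar spread is hard to compare across stocks trading at very different prices. One common normalization, sketched here as an optional extension rather than something the article's plot does, expresses the spread in basis points of the midpoint price. The function name and the bid/ask pairs are made up for illustration.

```python
def relative_spread_bps(bid, ask):
    """Bid-ask spread expressed as basis points of the midpoint price."""
    midpoint = (bid + ask) / 2
    return (ask - bid) / midpoint * 10_000


# Made-up (bid, ask) pairs for illustration only.
quotes = [(99.95, 100.05), (99.90, 100.10), (100.00, 100.02)]
for bid, ask in quotes:
    bps = relative_spread_bps(bid, ask)
    print(f"bid={bid:.2f} ask={ask:.2f} spread={bps:.1f} bps")
```

With the quotes DataFrame, the same formula can be applied column-wise using the ask_price and bid_price columns.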

Snapshots

Single Snapshot URL: GET /v2/stocks/{symbol}/snapshot

Multi Snapshots URL: GET /v2/stocks/snapshots

You can think of the Snapshots API as a combination of all the previous endpoints. A response from this API is a snapshot object that contains key-value pairs for a given stock’s latest trade, latest quote, most recent 1-minute bar data, and the two most recent daily bars. The method for this endpoint is get_snapshot(), and the only required parameter is the stock symbol.

symbol = "SPY"
snapshot = alpaca.get_snapshot(symbol=symbol)

You can read more about the properties of the response object inside the SDK repo, or by viewing the parsed version of the object below.

{
    "symbol": "SPY",
    "latestTrade": {
        "t": "2022-01-04T21:49:04.055398242Z",
        "x": "V",
        "p": 477.38,
        "s": 500,
        "c": [
            " ",
            "T"
        ],
        "i": 56592424269399,
        "z": "B"
    },
    "latestQuote": {
        "t": "2022-01-04T21:58:24.500619778Z",
        "ax": "V",
        "ap": 477.5,
        "as": 5,
        "bx": "V",
        "bp": 477.37,
        "bs": 5,
        "c": [
            "R"
        ],
        "z": "B"
    },
    "minuteBar": {
        "t": "2022-01-04T21:49:00Z",
        "o": 477.38,
        "h": 477.38,
        "l": 477.38,
        "c": 477.38,
        "v": 500,
        "n": 1,
        "vw": 477.38
    },
    "dailyBar": {
        "t": "2022-01-04T05:00:00Z",
        "o": 479.21,
        "h": 479.98,
        "l": 475.62,
        "c": 477.47,
        "v": 1592654,
        "n": 13044,
        "vw": 477.803188
    },
    "prevDailyBar": {
        "t": "2022-01-03T05:00:00Z",
        "o": 476.32,
        "h": 477.79,
        "l": 473.855,
        "c": 477.75,
        "v": 1095730,
        "n": 9213,
        "vw": 476.613039
    }
}
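
To illustrate how the pieces of the snapshot combine, here is a small sketch that reads a trimmed copy of the JSON above and derives two numbers from it: the current spread and the change from the previous daily close. It works on the raw JSON keys shown above rather than on the SDK's response object, and the variable names are chosen for illustration.

```python
import json

# Trimmed copy of the snapshot shown above (only the fields used here).
raw = """
{
  "symbol": "SPY",
  "latestQuote": {"ap": 477.5, "bp": 477.37},
  "dailyBar": {"c": 477.47},
  "prevDailyBar": {"c": 477.75}
}
"""
snapshot = json.loads(raw)

# Spread between the latest ask price (ap) and bid price (bp).
quote = snapshot["latestQuote"]
spread = quote["ap"] - quote["bp"]

# Percentage change of today's close so far versus the previous daily close.
close_today = snapshot["dailyBar"]["c"]
close_prev = snapshot["prevDailyBar"]["c"]
pct_change = (close_today - close_prev) / close_prev * 100

print(f"{snapshot['symbol']} spread: {spread:.2f}")
print(f"Change vs. previous close: {pct_change:.3f}%")
```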

A good use case for the snapshots API is checking the most recent price action in a stock. One way to do that is to look at the latest quote for a ticker and examine the current spread. First, get the snapshot and extract the latest quote.

# Setting parameters for method call
symbol = "GME"
# Get GameStop market snapshot and print the latest quote
gme_snapshot = alpaca.get_snapshot(symbol)
latest_quote = gme_snapshot.latest_quote

print(latest_quote)

# QuoteV2({   'ap': 127,
#     'as': 1,
#     'ax': 'V',
#     'bp': 105.05,
#     'bs': 1,
#     'bx': 'V',
#     'c': ['R'],
#     't': '2022-01-19T20:13:40.191542779Z',
#     'z': 'A'})

Now we can use Plotly to visualize the current bid and ask prices and the spread between them.

x = ["Ask Price", "Bid Price"]
y = [latest_quote.ap, latest_quote.bp]
quotes_figure = go.Figure(data=go.Bar(x=x, y=y))
quotes_figure.update_layout(
    title="Bar graph of Bid Price and Ask Price for $GME as of January 19, 2022 at 20:13:40 UTC",
    yaxis_title="Price ($USD)"
)
quotes_figure.show()

Figure 4 - Latest bid and ask price for $GME

Conclusion

In this article, we explored each type of historical data API: what it is, how to get a response, and what the response contains. We also used Plotly to illustrate one simple use case for each. Along the way, we’ve seen how the Alpaca Python client simplifies access to the data made available by these endpoints.

If you’re ready to take your knowledge to the next level, check out how you can utilize these endpoints and Plotly to create your own pairs trading strategy.

References

[1] A. Ganti, “Level 1 definition,” Investopedia, 21-Sep-2021. [Online]. Available: https://www.investopedia.com/terms/l/level1.asp. [Accessed: 07-Jan-2022].

[2] A. Hayes, “Level III quote,” Investopedia, 13-Sep-2021. [Online]. Available: https://www.investopedia.com/terms/l/level3.asp. [Accessed: 07-Jan-2022].

Please note that this article is for educational and informational purposes only. All screenshots are for illustrative purposes only. Alpaca does not recommend any specific securities or investment strategies.

All investments involve risk and the past performance of a security, or financial product does not guarantee future results or returns. Keep in mind that while diversification may help spread risk it does not assure a profit, or protect against loss, in a down market. There is always the potential of losing money when you invest in securities, or other financial products. Investors should consider their investment objectives and risks carefully before investing.

Alpaca does not prepare, edit, or endorse Third Party Content. Alpaca does not guarantee the accuracy, timeliness, completeness or usefulness of Third Party Content, and is not responsible or liable for any content, advertising, products, or other materials on or available from third party sites.

Brokerage services are provided by Alpaca Securities LLC ("Alpaca"), member FINRA/SIPC, a wholly-owned subsidiary of AlpacaDB, Inc. Technology and services are offered by AlpacaDB, Inc.

This is not an offer, solicitation of an offer, or advice to buy or sell securities, or open a brokerage account in any jurisdiction where Alpaca is not registered (Alpaca is registered only in the United States).