<- "MSFT" ticker
Financial/Economic Data using TidyQuant
To gather and analyze financial data, we’ll explore using the tidyquant
package - which allows us to directly download data on measures like stock prices as well as economic indicators from the Fed.
Stock Data
Analyzing a company
Let’s see how we grab stock price data directly from the web and work with it. First we’ll assign the stock ticker/symbol for the company we’re interested in exploring.
Use the tq_get()
function to download the stock data.
This function returns a data frame containing the date, open, high, low, and close prices for each day. For example:
<- tq_get(ticker, get = "stock.prices", from = "2022-01-01")
stock_data
%>%
stock_data head(10) %>%
gt()
symbol | date | open | high | low | close | volume | adjusted |
---|---|---|---|---|---|---|---|
MSFT | 2022-01-03 | 335.35 | 338.00 | 329.78 | 334.75 | 28865100 | 328.7277 |
MSFT | 2022-01-04 | 334.83 | 335.20 | 326.12 | 329.01 | 32674300 | 323.0910 |
MSFT | 2022-01-05 | 325.86 | 326.07 | 315.98 | 316.38 | 40054300 | 310.6882 |
MSFT | 2022-01-06 | 313.15 | 318.70 | 311.49 | 313.88 | 39646100 | 308.2332 |
MSFT | 2022-01-07 | 314.15 | 316.50 | 310.09 | 314.04 | 32720000 | 308.3902 |
MSFT | 2022-01-10 | 309.49 | 314.72 | 304.69 | 314.27 | 44289500 | 308.6161 |
MSFT | 2022-01-11 | 313.38 | 316.61 | 309.89 | 314.98 | 29386800 | 309.3133 |
MSFT | 2022-01-12 | 319.67 | 323.41 | 317.08 | 318.27 | 34372200 | 312.5441 |
MSFT | 2022-01-13 | 320.47 | 320.88 | 304.00 | 304.80 | 45366000 | 299.3165 |
MSFT | 2022-01-14 | 304.25 | 310.82 | 303.75 | 310.20 | 39846400 | 304.6194 |
As you can see above, we can specify how far back we want the data to go. (You can also optionally set and end by using to =
… if you don’t, it just defaults to the most recent day.)
Now here’s where it gets even more interesting and powerful… Let’s say instead of daily prices, you wish you could look at it on a monthly basis. Or annually. Or weekly.
HI THERE THIS IS NEW STUFF!
Well you could write some custom code yourself aimed at pulling out just the records for the last day of the month – but you don’t have to come up with that: tidyquant has done it for you using its tq_transmute()
function. (The function uses the powers of other financial packages such as xts, quantmod and TTR under the hood.)
Modify our data to be monthly instead, based on the last closing price of the month.
<- stock_data %>%
stocks_monthly group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = to.monthly,
indexAt = "lastof")
%>%
stocks_monthly gt()
date | close |
---|---|
MSFT | |
2022-01-31 | 310.98 |
2022-02-28 | 298.79 |
2022-03-31 | 308.31 |
2022-04-30 | 277.52 |
2022-05-31 | 271.87 |
2022-06-30 | 256.83 |
2022-07-31 | 280.74 |
2022-08-31 | 261.47 |
2022-09-30 | 232.90 |
2022-10-31 | 232.13 |
2022-11-30 | 255.14 |
2022-12-31 | 239.82 |
2023-01-31 | 247.81 |
2023-02-28 | 249.42 |
2023-03-31 | 288.30 |
2023-04-30 | 307.26 |
2023-05-31 | 328.39 |
2023-06-30 | 340.54 |
2023-07-31 | 335.92 |
2023-08-31 | 327.76 |
2023-09-30 | 315.75 |
2023-10-31 | 338.11 |
2023-11-30 | 382.70 |
Want to try annually instead? It’s just a matter of one small tweak. Check it out…
%>%
stock_data group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = to.yearly, #here's the change
indexAt = "lastof") %>%
gt()
date | close |
---|---|
MSFT | |
2022-12-31 | 239.82 |
2023-11-30 | 382.70 |
Now of course, just a couple years of annual data isn’t very illuminating. But if we want to go back to start earlier in time, it’s as simple as just asking R for it.
<- tq_get(ticker, get = "stock.prices", from = "2000-01-01")
stock_data_from2000
# stock_data_from2000
%>%
stock_data_from2000 group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = to.yearly, #here's the change
indexAt = "lastof") %>%
gt()
date | close |
---|---|
MSFT | |
2000-12-31 | 21.6875 |
2001-12-31 | 33.1250 |
2002-12-31 | 25.8500 |
2003-12-31 | 27.3700 |
2004-12-31 | 26.7200 |
2005-12-31 | 26.1500 |
2006-12-31 | 29.8600 |
2007-12-31 | 35.6000 |
2008-12-31 | 19.4400 |
2009-12-31 | 30.4800 |
2010-12-31 | 27.9100 |
2011-12-31 | 25.9600 |
2012-12-31 | 26.7100 |
2013-12-31 | 37.4100 |
2014-12-31 | 46.4500 |
2015-12-31 | 55.4800 |
2016-12-31 | 62.1400 |
2017-12-31 | 85.5400 |
2018-12-31 | 101.5700 |
2019-12-31 | 157.7000 |
2020-12-31 | 222.4200 |
2021-12-31 | 336.3200 |
2022-12-31 | 239.8200 |
2023-11-30 | 382.7000 |
Keep in mind, depending on the use case, and as you get more comfortable with this, you can combine some of these steps together…
tq_get(ticker, get = "stock.prices", from = "2000-01-01") %>%
group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = to.yearly, #here's the change
indexAt = "lastof") %>%
gt()
date | close |
---|---|
MSFT | |
2000-12-31 | 21.6875 |
2001-12-31 | 33.1250 |
2002-12-31 | 25.8500 |
2003-12-31 | 27.3700 |
2004-12-31 | 26.7200 |
2005-12-31 | 26.1500 |
2006-12-31 | 29.8600 |
2007-12-31 | 35.6000 |
2008-12-31 | 19.4400 |
2009-12-31 | 30.4800 |
2010-12-31 | 27.9100 |
2011-12-31 | 25.9600 |
2012-12-31 | 26.7100 |
2013-12-31 | 37.4100 |
2014-12-31 | 46.4500 |
2015-12-31 | 55.4800 |
2016-12-31 | 62.1400 |
2017-12-31 | 85.5400 |
2018-12-31 | 101.5700 |
2019-12-31 | 157.7000 |
2020-12-31 | 222.4200 |
2021-12-31 | 336.3200 |
2022-12-31 | 239.8200 |
2023-11-30 | 382.7000 |
There are all kinds of other questions we can ponder, and then pull together using relatively straightforward functions (all things considered).
Let’s say now that we have data going back to 2000, we’d like to also calculate what the annual return was for our company’s stock. We can do that like this:
%>%
stock_data_from2000 tq_transmute(select = close,
mutate_fun = periodReturn,
period = "yearly",
col_rename = "annual_return") %>%
gt()
date | annual_return |
---|---|
2000-12-29 | -0.62788204 |
2001-12-31 | 0.52737752 |
2002-12-31 | -0.21962263 |
2003-12-31 | 0.05880079 |
2004-12-31 | -0.02374868 |
2005-12-30 | -0.02133232 |
2006-12-29 | 0.14187384 |
2007-12-31 | 0.19223033 |
2008-12-31 | -0.45393255 |
2009-12-31 | 0.56790117 |
2010-12-31 | -0.08431758 |
2011-12-30 | -0.06986746 |
2012-12-31 | 0.02889060 |
2013-12-31 | 0.40059907 |
2014-12-31 | 0.24164664 |
2015-12-31 | 0.19440255 |
2016-12-30 | 0.12004326 |
2017-12-29 | 0.37656907 |
2018-12-31 | 0.18739769 |
2019-12-31 | 0.55262378 |
2020-12-31 | 0.41039951 |
2021-12-31 | 0.51209428 |
2022-12-30 | -0.28692911 |
2023-11-28 | 0.59578017 |
Want to see monthly returns instead? It’s as simple as doing:
<- stock_data_from2000 %>%
monthlyreturns tq_transmute(select = close,
mutate_fun = periodReturn,
period = "monthly", #here's the change
col_rename = "monthly_return")
%>%
monthlyreturns head(12) %>%
gt()
date | monthly_return |
---|---|
2000-01-31 | -0.16032172 |
2000-02-29 | -0.08684547 |
2000-03-31 | 0.18881119 |
2000-04-28 | -0.34352941 |
2000-05-31 | -0.10304659 |
2000-06-30 | 0.27872128 |
2000-07-31 | -0.12734375 |
2000-08-31 | 0.00000000 |
2000-09-29 | -0.13607878 |
2000-10-31 | 0.14196891 |
2000-11-30 | -0.16696915 |
2000-12-29 | -0.24400871 |
Now keep in mind what we did above used the closing price of the stock. But we might want to take into account dividends, stock splits, etc., which can affect as the stock’s value. If we want to adjust for these things to achieve a potentially more accurate picture of the stock’s returns over time, we can use the adjusted field in the data instead.
%>%
stock_data_from2000 tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "yearly",
col_rename = "annual_return") %>%
gt()
date | annual_return |
---|---|
2000-12-29 | -0.627881867 |
2001-12-31 | 0.527377478 |
2002-12-31 | -0.219622773 |
2003-12-31 | 0.068164404 |
2004-12-31 | 0.091281124 |
2005-12-30 | -0.009382027 |
2006-12-29 | 0.158393275 |
2007-12-31 | 0.208428584 |
2008-12-31 | -0.443855863 |
2009-12-31 | 0.604671913 |
2010-12-31 | -0.065246702 |
2011-12-30 | -0.045156865 |
2012-12-31 | 0.057988987 |
2013-12-31 | 0.442979534 |
2014-12-31 | 0.275646158 |
2015-12-31 | 0.226918899 |
2016-12-30 | 0.150777440 |
2017-12-29 | 0.407306172 |
2018-12-31 | 0.207953072 |
2019-12-31 | 0.575580805 |
2020-12-31 | 0.425340626 |
2021-12-31 | 0.524769064 |
2022-12-30 | -0.280247495 |
2023-11-28 | 0.609929809 |
Visualizing
Want to visualize the returns? We can do that too, using the ggplot2 package augmented by tidyquant.
First let’s make sure we’ve saved our annual return dataset as a new named object.
<- stock_data_from2000 %>%
annualreturns_data tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "yearly", #here's the change
col_rename = "annual_return")
Now we’ll create the chart.
%>%
annualreturns_data ggplot(aes(x = year(date), y = annual_return)) +
geom_col() +
labs(title = "Annual Returns",
subtitle = "2000 through the present",
y = "Returns", x = "", color = "") +
scale_y_continuous(labels = scales::percent) +
coord_flip() +
theme_tq()
What other kinds of visualizations can we do? How about we create a line chart of the stock’s closing price over time. We can do this by using the geom_line()
function of ggplot2. To simplify we’ll use our original stock data from 2020. For example:
ggplot(stock_data, aes(x = date, y = adjusted)) +
geom_line()
If we wanted to add some labels…
%>%
stock_data ggplot(aes(x = date, y = adjusted)) +
geom_line() +
labs(title = "Stock Price", y = "Closing Price", x = "")
Now once again, we have some very helpful financial measures built in to tidyquant. Let’s say we’d like to smooth things out here a bit, and calculate a moving average? The geom_ma()
function delivers it for us.
# simple moving averages
%>%
stock_data ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "Stock Price", y = "Closing Price", x = "") +
geom_ma(ma_fun = SMA, n = 50, color = "red", linewidth = 1.25) +
theme_minimal()
Want a 30-day moving average instead? Well you just have make one tiny tweak.
%>%
stock_data ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "Stock Price", y = "Closing Price", x = "") +
geom_ma(ma_fun = SMA, n = 30, color = "red", linewidth = 1.25) + #can you spot the change in this line?
theme_minimal()
Maybe you’d acutally like to have both at the same time? No problem. We’ll distinguish the colors and line design type here to make it easier to see.
%>%
stock_data ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "Stock Price", y = "Closing Price", x = "") +
geom_ma(ma_fun = SMA, n = 30, color = "blue", linewidth = 1.25, linetype = "dotted") +
geom_ma(ma_fun = SMA, n = 50, color = "red", linewidth = 1.25) +
theme_minimal()
And remember once again, like we did earlier above, we could choose to look at weekly, monthly, or annual prices instead of daily.
Also, note that above we’re using a Simple Moving Average (SMA) for all of our analysis. But tidyquant also supports a range of other calculations, including:
Exponential moving averages (EMA) Weighted moving averages (WMA) Double exponential moving averages (DEMA) Zero-lag exponential moving averages (ZLEMA) Volume-weighted moving averages (VWMA) Elastic volume-weighted moving averages (EVWMA)
Multiple companies at once
You may be asking, could I grab data on more than one company, so I can compare them? Indeed.
<- tq_get(c("AAPL", "MSFT", "NYT", "XOM"),
mycompanies get = "stock.prices",
from = "2022-01-01")
%>%
mycompanies head() %>%
gt()
symbol | date | open | high | low | close | volume | adjusted |
---|---|---|---|---|---|---|---|
AAPL | 2022-01-03 | 177.83 | 182.88 | 177.71 | 182.01 | 104487900 | 179.9539 |
AAPL | 2022-01-04 | 182.63 | 182.94 | 179.12 | 179.70 | 99310400 | 177.6700 |
AAPL | 2022-01-05 | 179.61 | 180.17 | 174.64 | 174.92 | 94537600 | 172.9440 |
AAPL | 2022-01-06 | 172.70 | 175.30 | 171.64 | 172.00 | 96904000 | 170.0570 |
AAPL | 2022-01-07 | 172.89 | 174.14 | 171.03 | 172.17 | 86709100 | 170.2251 |
AAPL | 2022-01-10 | 169.08 | 172.50 | 168.17 | 172.19 | 106765600 | 170.2448 |
%>%
mycompanies count(symbol) %>%
gt()
symbol | n |
---|---|
AAPL | 479 |
MSFT | 479 |
NYT | 479 |
XOM | 479 |
Now we’ll chart those out to compare, using almost identical code as above, but with some changes to allow small-multiple charts using facet_wrap()
.
%>%
mycompanies ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "", y = "Closing Price", x = "") +
facet_wrap(~ symbol, ncol = 2, scale = "free_y")
Want to add that moving average again? Can do that, too.
%>%
mycompanies ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "", y = "Closing Price", x = "") +
geom_ma(ma_fun = SMA, n = 50, color = "red", size = 1.25) +
facet_wrap(~ symbol, ncol = 2, scale = "free_y") +
theme_minimal()
Hmm, one of those is not like the other, right. What might some next steps be if we were going to follow our curiosity on this? Perhaps we could look at Exxon Mobil in the context of other competitors in the energy sector, etc.
Also, we can actually pull an entire index’s worth of companies at once. We’ll look at that a little later. For now, let’s shift to exploring economic indicator data we can capture.
FRED Economic Data
A wealth of economic data can be extracted from the Federal Reserve Economic Data (FRED) database. FRED contains thousands of data sets that are free to use. See the FRED categories to narrow down the data base and to get data codes. categories: https://fred.stlouisfed.org/categories
Let’s talk about them and FRED’s vast richness of data for business reporting.
In addition to the work we’re doing here in R, for example, there is an Excel plug-in for FRED data you may find useful as well. There’s even a FRED mobile app.
Examples
US Regular All Formulations Gas Price (GASREGW), weekly
<- tq_get("GASREGW", get = "economic.data", from = "2022-01-01")
gasprices
%>%
gasprices head(10) %>%
gt()
symbol | date | price |
---|---|---|
GASREGW | 2022-01-03 | 3.281 |
GASREGW | 2022-01-10 | 3.295 |
GASREGW | 2022-01-17 | 3.306 |
GASREGW | 2022-01-24 | 3.323 |
GASREGW | 2022-01-31 | 3.368 |
GASREGW | 2022-02-07 | 3.444 |
GASREGW | 2022-02-14 | 3.487 |
GASREGW | 2022-02-21 | 3.530 |
GASREGW | 2022-02-28 | 3.608 |
GASREGW | 2022-03-07 | 4.102 |
%>%
gasprices ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()
30 year mortgage rate average, weekly
<- tq_get("MORTGAGE30US", get = "economic.data", from = "2022-01-01") mortgate_30yr_weekly
%>%
mortgate_30yr_weekly ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()