Use yfinance to download historical data from Yahoo Finance

In today's digital age, leveraging technology to streamline financial analysis is not just a convenience; it's a necessity. Being able to read the three financial statements and calculate fundamental ratios is Finance 101 for any FP&A professional like me. However, instead of relying solely on manual Excel spreadsheets, we have a powerful ally at our disposal: Python. In this blog post, we will use the Python library yfinance to automate downloading historical data from Yahoo Finance, then use other standard libraries like pandas and seaborn to standardize formats, calculate related metrics, and create charts.

  1. Explore yfinance

yfinance is a free Python library, created by Ran Aroussi, for downloading data from Yahoo Finance. Its functions are pretty straightforward and beginner-friendly.

Full documentation can be found here: https://pypi.org/project/yfinance/

First, install yfinance if this is your first time using the library. Then load the libraries we will use throughout the post.

# Install the package the first time you use it
#pip install yfinance

# import libraries
import yfinance as yf
import pandas as pd
import datetime
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mpdates

# Set the global display format for numbers with commas
pd.options.display.float_format = '{:,.0f}'.format
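
To see what this display format does, here is a small sketch using plain Python string formatting (the sample numbers are made up for illustration). One thing worth noticing is that '{:,.0f}' also rounds percentage columns such as margins to whole numbers, so a format with one decimal place may suit those better.

```python
# The same format string that pandas will apply to every float it displays
whole_number = '{:,.0f}'.format

# Illustrative values only, not real financial data
print(whole_number(1234567.891))   # thousands separators, no decimals
print(whole_number(25.6))          # a margin of 25.6 is displayed rounded

# An alternative that keeps one decimal place
one_decimal = '{:,.1f}'.format
print(one_decimal(25.6))
```

The setting only changes how numbers are displayed; the underlying values in the DataFrame keep their full precision.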

Using the yf.Ticker(ticker_name) function, we can take a quick look at essential information about the company. Let's use TSLA as an example.

# Look at Tesla info
Stock_Ticker = 'TSLA'
Tesla = yf.Ticker(Stock_Ticker)
Tesla.info
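
The info attribute returns a dictionary with many entries, so it can help to pull out just a few fields. The sketch below is one way to do that; pick_info is a hypothetical helper, and the key names ('longName', 'sector', 'marketCap', 'fullTimeEmployees') are assumptions about what Yahoo Finance returns, which may differ by ticker or change over time. It runs on a stand-in dictionary so it works without a network connection.

```python
def pick_info(info, keys):
    """Return only the requested keys, using None for any that are missing."""
    return {key: info.get(key) for key in keys}

# Stand-in dictionary with made-up values; in this post it would be Tesla.info
sample_info = {'longName': 'Example Corp', 'sector': 'Technology', 'marketCap': 1_000_000}

summary = pick_info(sample_info, ['longName', 'sector', 'marketCap', 'fullTimeEmployees'])
for key, value in summary.items():
    print(f'{key}: {value}')
```

Using .get instead of square brackets means a missing key shows up as None rather than stopping the notebook with a KeyError.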

The {ticker}.quarterly_financials attribute shows data for the last 4 quarters.

# Get last 4 quarterly financial info
Tesla.quarterly_financials

The {ticker}.income_stmt attribute shows the line items of the annual income statement, though only the last 3 years of data are available.

  2. Calculate income statement ratios

We will calculate some basic income statement ratios using four main line items: Revenue, Gross Profit, EBITDA, and Net Income.

# Get historical financial data for the last 3 years
tesla_financials = pd.DataFrame(Tesla.income_stmt.loc[['Total Revenue', 'Gross Profit', 'EBITDA', 'Net Income']])

# Transpose features to columns to manipulate data more easily
tesla_financials = tesla_financials.transpose()

tesla_financials
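
The income statement has line items as rows and reporting dates as columns, which is why the transpose step makes later calculations easier. Here is a minimal sketch on a made-up two-period statement (the figures and dates are placeholders, not Tesla data). It swaps .loc for reindex, which is my own substitution: with reindex, a line item that is missing for some ticker shows up as NaN instead of raising a KeyError.

```python
import pandas as pd

# Made-up statement shaped like the one above: line items as rows, dates as columns
toy_stmt = pd.DataFrame(
    {pd.Timestamp('2023-12-31'): [200.0, 50.0],
     pd.Timestamp('2022-12-31'): [100.0, 20.0]},
    index=['Total Revenue', 'Gross Profit'],
)

# reindex keeps the requested order and fills absent line items with NaN
wanted = ['Total Revenue', 'Gross Profit', 'EBITDA']
selected = toy_stmt.reindex(wanted)

# After transposing, each row is a reporting date and each column a line item
by_date = selected.transpose()
print(by_date)
```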
# Put numbers in millions and update column titles
tesla_financials['Revenue (mil)'] = tesla_financials['Total Revenue'] / (10 ** 6)
tesla_financials['Gross Profit (mil)'] = tesla_financials['Gross Profit'] / (10 ** 6)
tesla_financials['EBITDA (mil)'] = tesla_financials['EBITDA'] / (10 ** 6)
tesla_financials['Net Income (mil)'] = tesla_financials['Net Income'] / (10 ** 6)

# Add a column for Gross Margin, EBITDA Margin, Net Profit Margin
tesla_financials['Gross Margin %'] = tesla_financials['Gross Profit'] / tesla_financials['Total Revenue'] * 100
tesla_financials['EBITDA Margin %'] = tesla_financials['EBITDA'] / tesla_financials['Total Revenue'] * 100
tesla_financials['Net Profit Margin %'] = tesla_financials['Net Income'] / tesla_financials['Total Revenue'] * 100

# Get a new dataframe
tesla_fin = tesla_financials[['Revenue (mil)', 'Gross Profit (mil)', 'EBITDA (mil)', 'Net Income (mil)', 
                                             'Gross Margin %', 'EBITDA Margin %', 'Net Profit Margin %' ]]
tesla_fin
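# Reset the index so the statement dates become a regular column
# (assigning the result avoids modifying a slice of tesla_financials in place)
tesla_fin = tesla_fin.reset_index()

# Rename the former index column to 'Date'
tesla_fin = tesla_fin.rename(columns={'index': 'Date'})

# Convert the 'Date' column to datetime format
tesla_fin['Date'] = pd.to_datetime(tesla_fin['Date'])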

# Sort the DataFrame by the 'Date' column in ascending order
tesla_fin = tesla_fin.sort_values(by='Date')

# Calculate YoY growth
tesla_fin['YoY Revenue Growth %'] = tesla_fin['Revenue (mil)'].pct_change() * 100
tesla_fin['YoY Gross Profit Growth %'] = tesla_fin['Gross Profit (mil)'].pct_change() * 100
tesla_fin['YoY EBITDA Growth %'] = tesla_fin['EBITDA (mil)'].pct_change() * 100
tesla_fin['YoY Rev Growth + EBITDA Margin'] = tesla_fin['YoY Revenue Growth %'] + tesla_fin['EBITDA Margin %']

tesla_fin
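
The sort step above matters because pct_change compares each row with the row directly above it. The statement data appears to arrive newest first, so without sorting the growth figures would run backwards in time. A small sketch with made-up revenue numbers shows the idea:

```python
import pandas as pd

# Made-up revenue figures, listed newest first
toy = pd.DataFrame({
    'Date': pd.to_datetime(['2023-12-31', '2022-12-31', '2021-12-31']),
    'Revenue (mil)': [300.0, 200.0, 100.0],
})

# pct_change compares each row with the row above it, so sort oldest first
toy = toy.sort_values(by='Date').reset_index(drop=True)
toy['YoY Revenue Growth %'] = toy['Revenue (mil)'].pct_change() * 100
print(toy)
```

The earliest year has no prior year to compare against, so its growth value is NaN.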

With YoY revenue growth above 50%, TSLA has earned its place as a hot (and hotly debated) stock to watch.

3. Analyze and Chart stock price performance

# Look at stock price performance from the start of 2023 through today
end_date = datetime.datetime.now()
start_date = pd.to_datetime('2023-01-01')

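# Download historical stock price data
stock_data = yf.download(Stock_Ticker, start=start_date, end=end_date)

# Newer yfinance versions may return two-level column labels (field, ticker)
# even for a single ticker; keep only the field names so 'Close' and 'Volume' work below
if isinstance(stock_data.columns, pd.MultiIndex):
    stock_data.columns = stock_data.columns.get_level_values(0)

# Reset index to have Date as a column
stock_data.reset_index(inplace=True)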

# Take a look at data
stock_data.info()

Then we can add a combo chart showing both the daily trading volume and the closing price of TSLA.

# Create a Matplotlib bar chart for volume on the primary y-axis
fig, ax1 = plt.subplots(figsize=(10,5))
ax1.bar(x='Date', height='Volume', data=stock_data, color='gray', alpha=0.3, label='Volume')

# Create a secondary y-axis for the closing price, drawn as a Seaborn line
ax2 = ax1.twinx()
sns.lineplot(x='Date', y='Close', data=stock_data, color='blue', label='Closing Price', ax=ax2)

# Set labels and title
ax1.set_xlabel('Date')
ax2.set_ylabel('Closing Price', color='blue')
ax1.set_ylabel('Volume', color='gray')
plt.title(f'TSLA Stock Prices and Volume ({start_date.date()} to {end_date.date()})')

# Set major locator and formatter for x-axis (dates)
locator = mpdates.MonthLocator(bymonthday=1)
formatter = mpdates.DateFormatter('%Y-%m-%d')
ax1.xaxis.set_major_locator(locator)
ax1.xaxis.set_major_formatter(formatter)

# rotate the x-axis tick labels for readability
ax1.tick_params(axis='x', rotation=50)

# Show the plot
plt.show()

We can repeat a similar process for other stock tickers and for other financial metrics in the Income Statement, Balance Sheet, or Cash Flow Statement.
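
One way to make the process repeatable is to wrap the ratio steps in a function. The sketch below is my own arrangement rather than code from the notebook: income_ratios is a hypothetical helper that expects an income statement with line items as rows and reporting dates as columns, and it is demonstrated on made-up numbers so it runs without a network connection.

```python
import pandas as pd

def income_ratios(stmt):
    """Compute margins and YoY revenue growth from an income statement.

    Expects line items as rows and reporting dates as columns.
    """
    items = ['Total Revenue', 'Gross Profit', 'EBITDA', 'Net Income']
    df = stmt.reindex(items).transpose().astype(float)
    df.index = pd.to_datetime(df.index)
    df = df.sort_index()  # oldest first, so growth compares to the prior year

    out = pd.DataFrame(index=df.index)
    out['Revenue (mil)'] = df['Total Revenue'] / 1e6
    out['Gross Margin %'] = df['Gross Profit'] / df['Total Revenue'] * 100
    out['EBITDA Margin %'] = df['EBITDA'] / df['Total Revenue'] * 100
    out['Net Profit Margin %'] = df['Net Income'] / df['Total Revenue'] * 100
    out['YoY Revenue Growth %'] = df['Total Revenue'].pct_change() * 100
    return out

# Made-up statement for illustration; with yfinance the input would be
# something like yf.Ticker('AAPL').income_stmt
toy_stmt = pd.DataFrame(
    {pd.Timestamp('2023-12-31'): [200e6, 50e6, 40e6, 20e6],
     pd.Timestamp('2022-12-31'): [100e6, 20e6, 10e6, 5e6]},
    index=['Total Revenue', 'Gross Profit', 'EBITDA', 'Net Income'],
)
ratios = income_ratios(toy_stmt)
print(ratios)
```

Calling the function once per ticker and collecting the results would then give a side-by-side comparison across companies.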

The notebook can be found in my GitHub repository:

https://github.com/ExcellentBee/Learning-Everyday/blob/main/Scrape%20Financial%20Data%20from%20Yahoo%20Finance.ipynb
