Home > Net >  Reduce to only row totalRevenue and rename the colunmn names in years using yahoo finance and pandas
Reduce to only row totalRevenue and rename the colunmn names in years using yahoo finance and pandas

Time:07-25

I try to scrape the yearly total revenues from yahoo finance using pandas and yahoo_fin by using the following code:

from yahoo_fin import stock_info as si
import yfinance as yf
import pandas as pd

tickers = ('AAPL', 'MSFT', 'IBM')

income_statements_yearly= [] #All numbers in thousands

for ticker in tickers:    
    income_statement = si.get_income_statement(ticker, yearly=True)    
    years = income_statement.columns
    income_statement.insert(loc=0, column='Ticker', value=ticker)
    for i in range(4):       
        #print(years[i].year)        
        income_statement.rename(columns = {years[i]:years[i].year}, inplace = True)
    income_statements_yearly.append(income_statement)
income_statements_yearly = pd.concat(income_statements_yearly)
income_statements_yearly

The result I get looks like:

enter image description here

enter image description here

I would like to create on that basis another dataframe revenues and reduce the dataframe to only the row totalRevenue instead of getting all rows and at the same time I would love to rename the columns 2021, 2020, 2019, 2018 to revenues_2021, revenues_2020, revenues_2019, revenues_2018.

The result shall look like:

df = pd.DataFrame({'Ticker': ['AAPL', 'MSFT', 'IBM'],
                   'revenues_2021': [365817000000, 168088000000, 57351000000],
                   'revenues_2020': [274515000000, 143015000000, 55179000000],
                   'revenues_2019': [260174000000, 125843000000, 57714000000],
                   'revenues_2018': [265595000000, 110360000000, 79591000000]})

How can I solve this in an easy and fast way?

Ty for your help in advance.

CodePudding user response:

CODE

revenues = income_statements_yearly.loc["totalRevenue"].reset_index(drop=True)
revenues.columns = ["Ticker"]   ["revenues_"   str(col) for col in revenues.columns if col != "Ticker"]

OUTPUT

            Ticker revenues_2021 revenues_2020 revenues_2019 revenues_2018
0            AAPL  365817000000  274515000000  260174000000  265595000000
1            MSFT  168088000000  143015000000  125843000000  110360000000
2             IBM   57351000000   55179000000   57714000000   79591000000
  • Related