Is it possible to create a master dataframe for many companies from yfinance?

Time:11-07

My objectives are to:

  1. Get Yahoo finance OHLC (Open, High, Low, and Close) data into Postgres.
  2. Be able to update the data easily.
  3. Be able to easily add or remove tickers.

My current methodology:

  1. Create pandas dataframe.
  2. Dump data to .csv
  3. Load the .csv into Postgres with COPY.
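
For steps 2 and 3, here is a minimal sketch of one way to feed a dataframe to COPY, assuming psycopg2 is the driver and that a table (hypothetically named `ohlc` here) already exists with columns matching the CSV header. It writes the CSV to an in-memory buffer instead of a file on disk; `frame_to_csv_buffer` and `copy_to_postgres` are illustrative names, not part of pandas or psycopg2.

```python
import io

import pandas as pd


def frame_to_csv_buffer(df):
    """Serialize a dataframe (index included) to an in-memory CSV buffer."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=True)
    buffer.seek(0)  # rewind so the reader starts at the header row
    return buffer


def copy_to_postgres(conn, df, table="ohlc"):
    """Stream the dataframe into Postgres with COPY ... FROM STDIN.

    Assumptions: `conn` is an open psycopg2 connection, and `table`
    (a hypothetical name) already exists with matching columns.
    """
    sql = f"COPY {table} FROM STDIN WITH (FORMAT csv, HEADER true)"
    with conn.cursor() as cur:
        cur.copy_expert(sql, frame_to_csv_buffer(df))
    conn.commit()


# Offline check of the CSV half only; no database is needed for this part.
sample = pd.DataFrame(
    {"TICKER": ["VIR"], "Close": [1.5]},
    index=pd.Index(["2021-01-04"], name="Date"),
)
csv_text = frame_to_csv_buffer(sample).getvalue()
print(csv_text)
```

If you prefer to keep the .csv file on disk, the same COPY statement works with a file opened for reading in place of the buffer.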

ISSUE: I do not know how to create a dataframe for company A, then append (merge, join, concat, etc.) the dataframes for the other companies (~150 so far) and dump the result to .csv.

Below are my actual code and a workaround that produces the desired result but is clunky.

Let me know what you think.

ACTUAL (not working as expected)

import pandas as pd 
import yfinance as yf


tickers = ['VIR','PATH']
#ticker = ['VIR']
for ticker in tickers:
    df_yahoo = yf.download(ticker, 
                        #start='2000-01-01',
                        #end='2010-12-31',
                        progress='True')
    df = pd.DataFrame(df_yahoo)
    df.insert(0, 'TICKER', ticker)

file_name = "/Users/kevin/Dropbox/Programming/Python/test_data/deleteme.csv"
df.to_csv(file_name)

print(df) 

WORKAROUND (working)

import pandas as pd 
import yfinance as yf
import pickle


tickers = ['VIR']
#ticker = ['VIR']
for ticker in tickers:

    df_yahoo = yf.download(ticker, 
                        #start='2000-01-01',
                        #end='2010-12-31',
                        progress='True')
    df = pd.DataFrame(df_yahoo)
    df.insert(0, 'TICKER', ticker)

tickers = ['PATH']
#ticker = ['VIR']
for ticker in tickers:
    
    df_yahoo = yf.download(ticker, 
                        #start='2000-01-01',
                        #end='2010-12-31',
                        progress='True')
    df1 = pd.DataFrame(df_yahoo)
    df1.insert(0, 'TICKER', ticker)

frames = [df1, df]

result = pd.concat(frames)

file_name = "/Users/kevin/Dropbox/Programming/Python/test_data/deleteme.csv"
result.to_csv(file_name)

print(df) 

CodePudding user response:

Given what I think you want to accomplish, this is how I would do it:

import pandas as pd
import yfinance as yf

# Create a function to load the data and create the frame
# Assumes len(tickers) >= 1
def build_df(tickers):
    # Download the first ticker and tag its rows with the symbol
    df = pd.DataFrame(yf.download(tickers[0],
                    #start='2000-01-01',
                    #end='2010-12-31',
                    progress=True))
    df.insert(0, 'TICKER', tickers[0])
    # Download each remaining ticker and append its rows to the master frame
    for ticker in tickers[1:]:
        dx = pd.DataFrame(yf.download(ticker,
                    #start='2000-01-01',
                    #end='2010-12-31',
                    progress=True))
        dx.insert(0, 'TICKER', ticker)
        df = pd.concat([df, dx])
    return df

Then call the function to assemble the desired dataframe as follows:

result = build_df(tickers)  

Finally, output the completed frame to CSV:

file_name = "/Users/kevin/Dropbox/Programming/Python/test_data/deleteme.csv"
result.to_csv(file_name)
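
With ~150 tickers, calling pd.concat inside the loop copies the growing frame on every pass. A variant worth considering is to collect the per-ticker frames in a list and concatenate once at the end. The sketch below assumes a `fetch` callable that maps a ticker to a date-indexed dataframe; `build_master_frame` and `fake_fetch` are hypothetical names, and `fake_fetch` only returns stand-in numbers so the example runs without a network connection.

```python
import pandas as pd


def build_master_frame(tickers, fetch):
    """Concatenate one OHLC frame per ticker into a single long frame.

    `fetch` is an assumed callable: ticker -> DataFrame indexed by date.
    """
    frames = []
    for ticker in tickers:
        frame = fetch(ticker).copy()
        if frame.empty:
            continue  # skip tickers that returned no rows
        frame.insert(0, "TICKER", ticker)
        frames.append(frame)
    if not frames:
        return pd.DataFrame()
    # A single concat at the end avoids re-copying the frame on each pass.
    return pd.concat(frames)


def fake_fetch(ticker):
    """Stand-in for a real download; the values are made up."""
    dates = pd.date_range("2021-01-04", periods=2, freq="D", name="Date")
    return pd.DataFrame({"Open": [1.0, 2.0], "Close": [1.5, 2.5]}, index=dates)


master = build_master_frame(["VIR", "PATH"], fake_fetch)
print(master)
```

To use it with real data, pass something like `lambda t: yf.download(t)` as `fetch`. Adding or removing a company is then just a change to the ticker list.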