I have a CSV file that contains the ticker symbols for all the stocks listed on Nasdaq. Here is a link to that CSV file; it can be downloaded from there. More than 8000 stocks are listed. This is the code:
import pandas as pd
import yfinance as yf  # pip install yfinance
tick_pd = pd.read_csv("/path/to/the/csv/file/nasdaq_screener_1654004691484.csv",
                      usecols=[0])
I have written a function that retrieves the historical stock prices for a ticker symbol:
## function to be applied on each stock symbol
def appfunc(ticker):
    # Full price history for one ticker, tagged with its symbol
    A = yf.Ticker(ticker).history(period="max")
    A["symbol"] = ticker
    return A
And I apply this function to each row of tick_pd in the following way:
hist_prices = tick_pd["Symbol"].apply(appfunc)  # one call per ticker symbol
But this takes far too long. Is there a way to retrieve this data more quickly, or a way to parallelize it? I am quite new to Python, so I don't know many ways to do this.
Thanks in advance
CodePudding user response:
You can use yf.download to download all the tickers in a single call, which fetches them concurrently:
tick_pd = pd.read_csv('nasdaq_screener_1654024849057.csv', usecols=[0])
df = yf.download(tick_pd['Symbol'].tolist(), period='max')
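The result of a multi-ticker download is a wide frame, while the function in the question produces one row per date and ticker with a "symbol" column. The following is a minimal sketch of reshaping one into the other. It assumes the columns form a two-level index with the price field on the first level and the ticker on the second, which may differ between yfinance versions or with a non-default group_by. The small frame below is made-up stand-in data, not real prices.

```python
import pandas as pd

# Stand-in for the frame returned by yf.download (assumed layout:
# first column level = price field, second column level = ticker).
dates = pd.to_datetime(["2022-05-30", "2022-05-31"])
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAPL", "MSFT"]])
wide = pd.DataFrame(
    [[1.0, 2.0, 10, 20],
     [3.0, 4.0, 30, 40]],
    index=dates,
    columns=columns,
)

# Move the ticker level from the columns into the index,
# then turn it into an ordinary "symbol" column.
long_df = (
    wide.stack(level=1)
        .rename_axis(["Date", "symbol"])
        .reset_index(level="symbol")
)
print(long_df)
```

With the real download, `wide` would simply be the `df` returned above; check `df.columns` first to confirm which level holds the tickers.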
You can also pass threads as a parameter to yf.download:
# Enable mass downloading (default is True)
df = yf.download(tick_pd['Symbol'].tolist(), period='max', threads=True)
# OR
# You can control the number of threads
df = yf.download(tick_pd['Symbol'].tolist(), period='max', threads=8)
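With more than 8000 symbols, it may be more manageable to send the list in smaller batches rather than in one call. This is only a sketch under assumptions: the helper names chunked and download_in_batches, the batch size of 500, and the injectable download argument are all hypothetical and not part of yfinance. Only yf.download with period and threads comes from the answer above.

```python
from typing import Callable, Iterator, List, Optional, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def download_in_batches(symbols: Sequence[str],
                        batch_size: int = 500,
                        download: Optional[Callable] = None) -> list:
    """Call the downloader once per batch and return the per-batch results.

    `download` defaults to yf.download; it is a parameter only so that the
    helper can be tried out without network access (hypothetical design).
    """
    if download is None:
        import yfinance as yf  # imported lazily; needs `pip install yfinance`
        download = yf.download
    return [download(batch, period="max", threads=True)
            for batch in chunked(symbols, batch_size)]


# Offline demonstration with a stand-in downloader (not the real API):
def fake_download(tickers, **kwargs):
    return len(tickers)


print(download_in_batches(["A", "B", "C", "D", "E"], batch_size=2,
                          download=fake_download))
# prints [2, 2, 1]
```

In real use, something like frames = download_in_batches(tick_pd['Symbol'].tolist()) would return one frame per batch, and pd.concat(frames, axis=1) should line them up on the date index, assuming each batch comes back in the same column layout.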