I have this dataset:
   date        event                                 ticker  initialprice  finalprice
0  2010-11-18  MELA Mela Sciences FDA Panel          MEL              0.0         0.0
1  2010-12-07  OREX Orexigen Therapeutics Inc PDUFA  OREX             0.0         0.0
2  2010-12-07  SLXP SALIX PHARMACEUTICALS LTD PDUFA  SLXP             0.0         0.0
3  2010-12-09  HGSI HUMAN GENOME SCIENCES INC PDUFA  HGSI             0.0         0.0
4  2010-12-16  AZN Astrazeneca PDUFA                 AZN              0.0         0.0
I would like to retrieve, for each row, the stock price on that exact date and one week before that date. I wrote this code, but it breaks when I use timedelta:
import datetime
from datetime import timedelta

import pandas as pd
from yahoofinancials import YahooFinancials

for index, row in fda.iterrows():
    try:
        ticker = row['ticker']
        print(ticker)
        # window for the event date: [date, date + 1 day]
        end_date0 = row['date'].strftime('%Y-%m-%d')
        end_date1 = row['date'] + timedelta(days=1)
        print(end_date1)
        end_date1 = end_date1.strftime('%Y-%m-%d')
        # window for one week earlier: [date - 7 days, date - 6 days]
        start_date0 = row['date'] - timedelta(days=7)
        start_date1 = row['date'] - timedelta(days=6)
        start_date0 = start_date0.strftime('%Y-%m-%d')
        start_date1 = start_date1.strftime('%Y-%m-%d')
        yahoo_financials = YahooFinancials(ticker)
        # price on the event date
        data = yahoo_financials.get_historical_price_data(start_date=end_date0,
                                                          end_date=end_date1,
                                                          time_interval='daily')
        aapl_df = pd.DataFrame(data[ticker]['prices'])
        price = float(aapl_df.iloc[0, 4])
        print(end_date0, end_date1, price)
        row['finalprice'] = float(price)
        # price one week before the event date
        data = yahoo_financials.get_historical_price_data(start_date=start_date0,
                                                          end_date=start_date1,
                                                          time_interval='daily')
        aapl_df = pd.DataFrame(data[ticker]['prices'])
        price = float(aapl_df.iloc[0, 4])
        print(start_date0, start_date1, price)
        row['initialprice'] = float(price)
    except:
        pass
CodePudding user response:
A timedelta can't be added to a string; you have to convert the string to a datetime object first:
# convert the date string to a datetime object, using the format in the 2nd argument
date = datetime.datetime.strptime(date, '%Y-%m-%d')
# add 1 day
added_date = date + timedelta(days=1)
# convert the result back to a string if needed
added_date = added_date.strftime('%Y-%m-%d')
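As a self-contained illustration of the parse, shift, and format steps above, here is a minimal sketch using only the standard library. The helper name shift_date and the constant DATE_FORMAT are made up for this example; they are not part of any library.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d"  # same format as the dates in the question


def shift_date(date_str: str, days: int) -> str:
    """Return date_str moved by `days` days, as a YYYY-MM-DD string."""
    parsed = datetime.strptime(date_str, DATE_FORMAT)  # string -> datetime
    shifted = parsed + timedelta(days=days)            # arithmetic needs a datetime
    return shifted.strftime(DATE_FORMAT)               # datetime -> string


event_date = "2010-12-07"
window_start = shift_date(event_date, -7)  # one week before the event
window_end = shift_date(event_date, 1)     # the day after the event
print(window_start, event_date, window_end)
```

Strings in this format can then be passed wherever the question's code passes start_date and end_date.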
CodePudding user response:
You can convert your date column from string to datetime with df.date = df.date.astype('datetime64[ns]') (recent pandas versions reject the unit-less 'datetime64'), and perform the date subtraction with df.date - pd.tseries.offsets.Week().
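A small sketch of that conversion and subtraction, assuming pandas is installed. The frame below copies three dates and tickers from the question; the column names week_before and day_after are invented for this example, and pd.to_datetime is used here as an equivalent way to do the conversion.

```python
import pandas as pd

# Three rows taken from the question's dataset, with dates still stored as strings.
df = pd.DataFrame({
    "date": ["2010-11-18", "2010-12-07", "2010-12-16"],
    "ticker": ["MEL", "OREX", "AZN"],
})

# Convert the string column to datetime64 so date arithmetic works.
df["date"] = pd.to_datetime(df["date"])

# Vectorised date arithmetic: no loop over rows is needed.
df["week_before"] = df["date"] - pd.tseries.offsets.Week()
df["day_after"] = df["date"] + pd.Timedelta(days=1)

print(df)
```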
Now, to achieve what you want ("retrieve for each row the stock price for that exact date and for one week" before it), you don't need to iterate over the DataFrame.
Try:
stock_column = 'ticker'  # Name of the column that identifies each stock
df.date = df.date.astype('datetime64[ns]')  # convert date column to datetime type
# Create a multiindex with your date minus 1 week and your stock
multiindex = pd.MultiIndex.from_arrays(
    (
        (df.date - pd.tseries.offsets.Week()),
        df[stock_column]
    )
)
# map the columns for initial price and final price for each stock 1 week before
df['initialprice_1_week_before'] = multiindex.map(
    df.set_index(['date', stock_column])['initialprice']
)
df['finalprice_1_week_before'] = multiindex.map(
    df.set_index(['date', stock_column])['finalprice']
)
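Note that this kind of lookup only finds a value when the DataFrame itself contains a row for the same ticker exactly one week earlier. The sketch below shows the same idea with a left merge instead of MultiIndex.map, which is a swapped-in technique rather than the code above. The prices and the week_before column are invented for illustration.

```python
import pandas as pd

# Toy data (prices invented): HGSI has two rows one week apart, AZN has one row.
df = pd.DataFrame({
    "date": pd.to_datetime(["2010-12-02", "2010-12-09", "2010-12-16"]),
    "ticker": ["HGSI", "HGSI", "AZN"],
    "finalprice": [25.0, 26.5, 47.0],
})

# Lookup table keyed by (date, ticker), renamed so it can join on the shifted date.
lookup = df[["date", "ticker", "finalprice"]].rename(
    columns={"date": "week_before", "finalprice": "finalprice_1_week_before"}
)

# The date to look up for each row: exactly one week earlier.
df["week_before"] = df["date"] - pd.Timedelta(weeks=1)

# Left merge keeps every original row; rows without a match get NaN.
result = df.merge(lookup, on=["week_before", "ticker"], how="left")
print(result)
```

Only the second HGSI row gets a value here, because it is the only row with a same-ticker entry seven days earlier. For the dataset in the question, where no ticker appears twice, the prices would still have to come from an external source.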