I'm trying to convert a daily OHLC price of a stock to monthly with the following code:
def daily_to_monthly_for_one_stock(path):
df = pd.read_csv(path)
logic = {'<Open>' : 'first',
'<High>' : 'max',
'<Low>' : 'min',
'<Close>' : 'last',
'<Volume>': 'sum'}
df['<DTYYYYMMDD>'] = pd.to_datetime(df['<DTYYYYMMDD>'])
df = df.resample('MS', on='<DTYYYYMMDD>').agg(logic)
idx = df.reset_index().groupby(df.index.to_period('M'))['<DTYYYYMMDD>'].idxmin()
df.index = df.iloc[idx].index
return df
However, the monthly data I got always started with day 01, which were not always present in the original daily data.
<DTYYYYMMDD><Ticker><Open><High><Low><Close><Volume>
2010-07-01 AAA 10.6552 11.5960 9.6268 11.0490 2220000
2010-08-01 AAA 11.0490 16.5626 9.1018 16.5407 6579600
2010-09-01 AAA 16.5407 20.5665 11.1803 11.9898 14122900
2010-10-01 AAA 11.9898 12.8212 6.2575 7.0451 9518400
2010-11-01 AAA 7.0451 7.9859 5.1416 7.5921 9423700
In the second row, the date present in the daily data was 2010-08-02, not 2010-08-01. Can you show me a way to fix this?
CodePudding user response:
IIUC, add 'DTYYYYMMDD': 'first'
to your logic
dict.
Example:
>>> df
DTYYYYMMDD Open High
0 2010-08-02 10 11
1 2010-08-03 13 14
2 2010-08-04 15 16
>>> logic
{'DTYYYYMMDD': 'first', 'Open': 'first', 'High': 'max'}
>>> df.resample('MS', on='DTYYYYMMDD').agg(logic)
DTYYYYMMDD Open High
DTYYYYMMDD
2010-08-01 2010-08-02 10 16
CodePudding user response:
Thanks to @corralien, I was able to come up with a full working solution as follows, just leaving it here in case anyone needs it:
Reusable Function
def daily_to_monthly_for_one_stock(path, d='<DTYYYYMMDD>', o='<Open>', h='<High>', l='<Low>', c='<Close>', v='<Volume>'):
df = pd.read_csv(path)
df.index = pd.to_datetime(df[d], format='%Y-%m-%d')
logic = {d: 'first', o: 'first', h: 'max', l:'min', c: 'last', v: 'sum',}
df = df.resample('MS').agg(logic)
df.reset_index(drop=True, inplace=True)
return df
Testing
df = daily_to_monthly_for_one_stock3('./AAA.csv')
<DTYYYYMMDD> <Open> <High> <Low> <Close> <Volume>
0 2010-07-15 10.6552 11.5960 9.6268 11.0490 2220000
1 2010-08-02 11.0490 16.5626 9.1018 16.5407 6579600
2 2010-09-01 16.5407 20.5665 11.1803 11.9898 14122900
3 2010-10-01 11.9898 12.8212 6.2575 7.0451 9518400
4 2010-11-01 7.0451 7.9859 5.1416 7.5921 9423700