I have data that i want to resample use end of month based on original df but when i use df.resample('M').last(). the end of month date that i got is different from original df. see the asterix marks. 2005-12-31 should be >> 2005-12-29. any suggestion ? what parameter should i add into .resample() ?
orginal df =
DATE
2005-12-27 1161.707
2005-12-28 1164.143
*2005-12-29 1162.635*
2006-01-02 1171.709
2006-01-03 1184.690
2006-01-04 1211.699
test_resample = df.resample('M').last()
DATE
2005-11-30 1096.641
*2005-12-31 1162.635*
2006-01-31 1232.321
CodePudding user response:
Example
data = {'2005-12-27': 1161.707,
'2005-12-28': 1164.143,
'2005-12-29': 1162.635,
'2006-01-02': 1171.709,
'2006-01-03': 1184.69,
'2006-01-04': 1211.699}
s = pd.Series(data)
s.index = pd.to_datetime(s.index)
output(s
):
2005-12-27 1161.707
2005-12-28 1164.143
2005-12-29 1162.635
2006-01-02 1171.709
2006-01-03 1184.690
2006-01-04 1211.699
dtype: float64
Code
s.groupby(s.index.to_period('M')).tail(1)
output:
2005-12-29 1162.635
2006-01-04 1211.699
dtype: float64
If s
is not sorted by time order, sort index
CodePudding user response:
You can't directly with resample
, you should instead groupby.agg
after temporarily resetting the index:
(df.reset_index()
.groupby(df.index.to_period('M'))
.agg({'DATE': 'last', 'value': 'last'})
.set_index('DATE')
)
Output:
value
DATE
2005-12-29 1162.635
2006-01-04 1211.699