Given a sample data as follows:
import pandas as pd
import numpy as np
np.random.seed(2021)
dates = pd.date_range('20130226', periods=90)
df = pd.DataFrame(np.random.uniform(0, 10, size=(90, 4)), index=dates,
columns=['A_values', 'B_values', 'C_values', 'target'])
df
Out:
A_values B_values C_values target
2013-02-26 6.059783 7.333694 1.389472 3.126731
2013-02-27 9.972433 1.281624 1.789931 7.529254
2013-02-28 6.621605 7.843101 0.968944 0.585713
2013-03-01 9.623960 6.165574 0.866300 5.612724
2013-03-02 6.165247 9.638430 5.743043 3.711608
... ... ... ...
2013-05-22 0.589729 6.479978 3.531450 6.872059
2013-05-23 6.279065 3.837670 8.853146 8.209883
2013-05-24 5.533017 5.241127 1.388056 5.355926
2013-05-25 1.596038 4.665995 2.406251 1.971875
2013-05-26 3.269001 1.787529 6.659690 7.545569
With code below, we can see the last row's index is outside of range of the initial date index (maximum date is 2013-05-26
):
df.groupby(pd.Grouper(freq='M')).last()
Out[177]:
A_values B_values C_values target
2013-02-28 6.621605 7.843101 0.968944 0.585713
2013-03-31 5.906967 8.545341 6.326550 8.684117
2013-04-30 5.358775 1.473809 5.231534 0.604810
2013-05-31 3.269001 1.787529 6.659690 7.545569
and:
df.groupby(df.index.to_period('M')).apply(lambda x: x.index.max())
Out[178]:
2013-02 2013-02-28
2013-03 2013-03-31
2013-04 2013-04-30
2013-05 2013-05-26
Freq: M, dtype: datetime64[ns]
But I hope to get an expected result as follows, how could I do that? Thanks.
A_values B_values C_values target
2013-02-28 6.621605 7.843101 0.968944 0.585713
2013-03-31 5.906967 8.545341 6.326550 8.684117
2013-04-30 5.358775 1.473809 5.231534 0.604810
2013-05-26 3.269001 1.787529 6.659690 7.545569 # date should be `2013-05-26` based on the original data
CodePudding user response:
Idea is create helper column from DatetimeIndex
and after last
convert column to index
:
df = (df.assign(new=df.index)
.groupby(pd.Grouper(freq='M'))
.last()
.set_index('new')
.rename_axis(None))
print (df)
A_values B_values C_values target
2013-02-28 6.621605 7.843101 0.968944 0.585713
2013-03-31 5.906967 8.545341 6.326550 8.684117
2013-04-30 5.358775 1.473809 5.231534 0.604810
2013-05-26 3.269001 1.787529 6.659690 7.545569