I have a pandas DataFrame df:
DIFF_HOURS DIFF_TEMP
0 0.0 0.050886
1 1.0 0.660698
2 2.0 1.656014
3 3.0 2.543857
4 4.0 3.071813
... ... ...
627647 68.0 -1.708911
627648 69.0 -1.225022
627649 70.0 -2.040668
627650 71.0 -2.738665
For data visualization, I draw several boxplots with x=DIFF_HOURS and y=DIFF_TEMP.
I want to have subgroups of 6 hours:
Group 1: 0,1,2,3,4,6
Group 2: 7,8,9,10,11,12
...
Group n: 66,67,68,69,70,71,72
Then I want to replace every value in each subgroup with that subgroup's minimum value:
Group 1: 0,0,0,0,0,0
Group 2: 7,7,7,7,7,7
...
Group n: 66,66,66,66,66,66
I don't want to use a loop. Is it possible to use the pandas apply() function, please?
CodePudding user response:
Try this (it assumes the rows are in order, so that a new group starts at each multiple of 6):
df['DIFF_HOURS'] = df.groupby(df['DIFF_HOURS'].mod(6).eq(0).cumsum())['DIFF_HOURS'].transform('min')
CodePudding user response:
My solution, which works fine and is fast:
every_hours = 6
max_periode = 72
for i in range(0, max_periode, every_hours):
    # map every hour in (i, i + every_hours] to i
    df.loc[(df['DIFF_HOURS'] > i) & (df['DIFF_HOURS'] <= (i + every_hours)), 'DIFF_HOURS'] = i
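For what it's worth, the same binning can probably be done without a loop, using plain arithmetic on the column. The sketch below is only an illustration on made-up values: the toy df and the column names BIN_FLOOR and BIN_LOOP are my own, not from the question. Option A assumes the bins are 0-5, 6-11, ... (labelled by their lowest hour); option B assumes the bins used by the loop above, where 1-6 maps to 0, 7-12 maps to 6, and so on.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real df (hypothetical values).
df = pd.DataFrame({"DIFF_HOURS": [0.0, 1.0, 5.0, 6.0, 7.0, 12.0, 13.0, 71.0, 72.0]})

every_hours = 6

# Option A: bins [0, 6), [6, 12), ... labelled by their lower edge.
df["BIN_FLOOR"] = df["DIFF_HOURS"] // every_hours * every_hours

# Option B: bins (0, 6], (6, 12], ... labelled by their lower edge,
# with 0 staying 0 (this mirrors what the loop does).
df["BIN_LOOP"] = (
    (np.ceil(df["DIFF_HOURS"] / every_hours) - 1) * every_hours
).clip(lower=0)

print(df)
```

Either new column can then be passed as x to the boxplot in place of DIFF_HOURS. Note that, unlike the loop, option B also rebins hours above max_periode, so it only matches the loop when all hours are within 0 to 72.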