I want to add each holiday's value to the next non-holiday day, then drop the holiday rows. How can I do that?
My pandas DataFrame df:
Date Holiday Value
01.01.2022 0 4
02.01.2022 1 0.5
03.01.2022 1 0.1
04.01.2022 0 0.2
05.01.2022 0 0.2
06.01.2022 0 0.4
07.01.2022 0 5
08.01.2022 1 0.3
09.01.2022 0 5
Desired output:
Date Holiday Value
01.01.2022 0 4
04.01.2022 0 0.8
05.01.2022 0 0.2
06.01.2022 0 0.4
07.01.2022 0 5
09.01.2022 0 5.3
Sum all consecutive rows where Holiday is 1 (in order) and add the total to the following row where Holiday is 0, for example:
04.01.2022 -> 0.5 + 0.1 + 0.2 = 0.8
How can I do that with pandas?
CodePudding user response:
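You can compute groups that each end on a row where Holiday is 0, then aggregate each column with last or sum:
# 1 - Holiday is 1 on non-holiday rows; the reversed cumsum starts a new group at each of them
group = df['Holiday'].rsub(1)[::-1].cumsum()
out = (df.groupby(group, sort=False, as_index=False)
         .agg({'Date': 'last', 'Holiday': 'last', 'Value': 'sum'})
      )
Output: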
Date Holiday Value
0 01.01.2022 0 4.0
1 04.01.2022 0 0.8
2 05.01.2022 0 0.2
3 06.01.2022 0 0.4
4 07.01.2022 0 5.0
5 09.01.2022 0 5.3
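To see why the reversed cumsum works, it can help to print the group labels next to the data. The following is a minimal self-contained sketch that rebuilds the question's frame by hand (the literal values are copied from the question; the variable name `labels` is only for illustration):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Date': ['01.01.2022', '02.01.2022', '03.01.2022', '04.01.2022', '05.01.2022',
             '06.01.2022', '07.01.2022', '08.01.2022', '09.01.2022'],
    'Holiday': [0, 1, 1, 0, 0, 0, 0, 1, 0],
    'Value': [4, 0.5, 0.1, 0.2, 0.2, 0.4, 5, 0.3, 5],
})

# 1 - Holiday flags the non-holiday rows; scanning bottom-up, every
# non-holiday row bumps the counter, so holidays share the label of the
# next non-holiday row below them.
group = df['Holiday'].rsub(1)[::-1].cumsum()

# Labels in original row order (illustrative helper variable).
labels = group.sort_index().tolist()
print(labels)

# groupby aligns the key on the index, so the reversed order of `group`
# does not matter here.
out = (df.groupby(group, sort=False, as_index=False)
         .agg({'Date': 'last', 'Holiday': 'last', 'Value': 'sum'}))
print(out)
```

Rows 02.01, 03.01 and 04.01 share one label, which is why their values end up summed on 04.01.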
CodePudding user response:
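We can use cumsum with groupby:
key = df['Holiday'].eq(0).iloc[::-1].cumsum()
# groupby sorts by key, which puts the latest date first, so reverse the result
out = df.groupby(key).agg({'Date':'last','Value':'sum'}).iloc[::-1]
# optionally restore the Holiday column, which is 0 for every remaining row
#out['Holiday'] = 0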
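One case the question does not cover is a holiday at the very end of the data, with no non-holiday day after it. With the reversed cumsum such rows get label 0 and would otherwise survive as their own group. Below is a hedged sketch that wraps the same idea in a function and drops them; the function name `merge_holidays` and the decision to drop trailing holidays are assumptions, not part of the answers here:

```python
import pandas as pd

def merge_holidays(df):
    """Fold each run of holiday rows into the next non-holiday row.

    Hypothetical helper: trailing holidays with no following
    non-holiday row are dropped (an assumption).
    """
    # Positional reverse cumsum of the non-holiday flag.
    is_workday = df['Holiday'].eq(0).to_numpy()
    key = is_workday[::-1].cumsum()[::-1]
    # Label 0 only occurs after the last non-holiday row.
    keep = key > 0
    return (df[keep]
            .groupby(key[keep], sort=False)
            .agg({'Date': 'last', 'Holiday': 'last', 'Value': 'sum'})
            .reset_index(drop=True))

# Question data plus one extra trailing holiday (the last row is made up
# for this illustration).
df = pd.DataFrame({
    'Date': ['01.01.2022', '02.01.2022', '03.01.2022', '04.01.2022', '05.01.2022',
             '06.01.2022', '07.01.2022', '08.01.2022', '09.01.2022', '10.01.2022'],
    'Holiday': [0, 1, 1, 0, 0, 0, 0, 1, 0, 1],
    'Value': [4, 0.5, 0.1, 0.2, 0.2, 0.4, 5, 0.3, 5, 0.7],
})

out = merge_holidays(df)
print(out)
```

Working on the NumPy array instead of the reversed Series avoids relying on index alignment, so this should also behave with a non-default index.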
CodePudding user response:
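You can iterate over the DataFrame, accumulating the holiday values in a counter, and then keep only the rows where Holiday is 0.
Try:
counter = 0
for idx, row in df.iterrows():
    if row['Holiday'] == 1:
        # accumulate the values of consecutive holidays
        counter += row['Value']
    else:
        # add the accumulated holiday total to this non-holiday row
        df.loc[idx, 'Value'] = row['Value'] + counter
        counter = 0
# drop the holiday rows
df = df.loc[df['Holiday'] == 0]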