Sum until condition for all (Pandas)

Time:07-08

I want to add each holiday's value to the following non-holiday day, then drop the holiday rows. How can I do that?

My pandas DataFrame df:

Date        Holiday   Value
01.01.2022     0       4
02.01.2022     1       0.5
03.01.2022     1       0.1
04.01.2022     0       0.2
05.01.2022     0       0.2
06.01.2022     0       0.4
07.01.2022     0       5
08.01.2022     1       0.3
09.01.2022     0       5

I want to get:

Date        Holiday   Value
01.01.2022     0       4
04.01.2022     0       0.8 
05.01.2022     0       0.2
06.01.2022     0       0.4
07.01.2022     0       5
09.01.2022     0       5.3

Sum the values of all consecutive 1 rows (in order), then add the total to the following 0 row. For example:

04.01.2022 -> 0.5 + 0.1 + 0.2 = 0.8

How can I do that with pandas?

CodePudding user response:

You can compute groups that each end on a 0 row, then aggregate each column with last or sum:

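# 1 - Holiday is 1 on non-holiday rows; the reversed cumsum gives each
# holiday the same label as the next non-holiday row
group = df['Holiday'].rsub(1)[::-1].cumsum()
out = (df.groupby(group, sort=False, as_index=False)
         .agg({'Date': 'last', 'Holiday': 'last', 'Value': 'sum'})
       )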

output:

         Date  Holiday  Value
0  01.01.2022        0    4.0
1  04.01.2022        0    0.8
2  05.01.2022        0    0.2
3  06.01.2022        0    0.4
4  07.01.2022        0    5.0
5  09.01.2022        0    5.3
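
To see why this works, it can help to print the intermediate group labels next to the data. The following is a minimal sketch that rebuilds the sample frame from the question; the name labelled is only illustrative.

```python
import pandas as pd

# sample data copied from the question
df = pd.DataFrame({
    'Date': ['01.01.2022', '02.01.2022', '03.01.2022', '04.01.2022',
             '05.01.2022', '06.01.2022', '07.01.2022', '08.01.2022',
             '09.01.2022'],
    'Holiday': [0, 1, 1, 0, 0, 0, 0, 1, 0],
    'Value': [4, 0.5, 0.1, 0.2, 0.2, 0.4, 5, 0.3, 5],
})

# 1 - Holiday marks non-holiday rows with 1; cumulating from the bottom up
# gives every holiday the same label as the next non-holiday row below it
group = df['Holiday'].rsub(1)[::-1].cumsum()

# assign aligns on the index, so each label lands on its original row
labelled = df.assign(group=group)
print(labelled)
```

Rows 02.01, 03.01 and 04.01 share one label, as do 08.01 and 09.01, so every group ends on a non-holiday row and 'last' picks that row's Date and Holiday.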

CodePudding user response:

We can use a reversed cumsum as the groupby key:

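# label each non-holiday row, counting from the bottom; holidays inherit
# the label of the next non-holiday row
key = df['Holiday'].eq(0).iloc[::-1].cumsum()

# groupby sorts by key (bottom-up), so reverse to restore date order
out = df.groupby(key).agg({'Date':'last','Value':'sum'}).iloc[::-1]
# out['Holiday'] = 0  # optional: restore the Holiday column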
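
Note that this result is indexed by the group key and has no Holiday column. If the layout from the question is needed, one possible finishing step is sketched below. It rebuilds the sample data from the question, the name result is illustrative, and it assumes the data ends on a non-holiday row.

```python
import pandas as pd

# sample data copied from the question
df = pd.DataFrame({
    'Date': ['01.01.2022', '02.01.2022', '03.01.2022', '04.01.2022',
             '05.01.2022', '06.01.2022', '07.01.2022', '08.01.2022',
             '09.01.2022'],
    'Holiday': [0, 1, 1, 0, 0, 0, 0, 1, 0],
    'Value': [4, 0.5, 0.1, 0.2, 0.2, 0.4, 5, 0.3, 5],
})

key = df['Holiday'].eq(0).iloc[::-1].cumsum()
out = df.groupby(key).agg({'Date': 'last', 'Value': 'sum'}).iloc[::-1]

# drop the group-key index and put Holiday back between Date and Value;
# every kept row is a non-holiday as long as the data ends on one
result = out.reset_index(drop=True)
result.insert(1, 'Holiday', 0)
print(result)
```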

CodePudding user response:

You can iterate over the DataFrame, accumulating the holiday values in a counter, and then keep only the 0 rows.

Try:

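counter = 0
for idx, row in df.iterrows():
    if row['Holiday'] == 1:
        # accumulate holiday values until the next non-holiday row
        counter += row['Value']
    else:
        # add the accumulated total to this row, then reset the counter
        df.loc[idx, 'Value'] += counter
        counter = 0

# keep only the non-holiday rows
df = df.loc[df['Holiday'] == 0]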