Home > other >  Filter pandas column with current row values and sum another column to form a new column
Filter pandas column with current row values and sum another column to form a new column

Time:04-06

and I have the following dataset.

Date Area Value Delivery_Date
01-01-2021 ABC 10 02-01-2021
02-01-2021 BCD 20 04-01-2021
03-01-2021 ABC 15 04-01-2021
04-01-2021 BCD 25 05-01-2021
05-01-2021 ABC 15 06-01-2021

I have to create a new column named Sum which follows following conditions.

It should take the current row and should look for values from other rows which has same Area and date greater than current row date and delivery date greater than current row "date" as well. It should sum up the values from rows which meet these three conditions. So the resulting table will look like

Date Area Value Delivery_Date Sum
01-01-2021 ABC 10 02-01-2021 40
02-01-2021 BCD 20 04-01-2021 45
03-01-2021 ABC 15 04-01-2021 30
04-01-2021 BCD 25 05-01-2021 25
05-01-2021 ABC 15 06-01-2021 15

I am not sure how to start. Can someone help me?

CodePudding user response:

IIUC, use a GroupBy expanding.sum after sorting the data on the dates (recent to ancient):

# ensure datetime (although this format could be also sorted as string)
df['Date'] = pd.to_datetime(df['Date'])

df['sum'] = (df
 .sort_values(by='Date', ascending=False)      # reverse values
 .groupby(['Area'])['Value'].expanding().sum() # sum recent values
 .droplevel(0)
)

output:

        Date Area  Value   sum
0 2021-01-01  ABC     10  40.0
1 2021-02-01  BCD     20  45.0
2 2021-03-01  ABC     15  30.0
3 2021-04-01  BCD     25  25.0
4 2021-05-01  ABC     15  15.0
  • Related