Home > OS >  Take sum of last 7 days with datetime
Take sum of last 7 days with datetime

Time:12-20

My data looks like this:

    Date        AcuteLegs_1day  
74  2021-12-09  450.0   
75  2021-12-10  840.0   
76  2021-12-13  900.0   
77  2021-12-14  630.0   
78  2021-12-15  400.0   
79  2021-12-16  360.0   
80  2021-12-19  150.0   
81  2021-12-20  600.0   
82  2021-12-21  600.0   

I want a new column with the sum off this load off the last 7 days: Now I use:

df['AcuteLegs_7days']=df["AcuteLegs_1day"].rolling(7).sum()
df['AcuteLegs_7days']=df['AcuteLegs_7days'].shift(1)

This works nice. But it takes the sum from the last 7 rows. But because sometimes there is no data from a date the last seven rows is not the last 7 days.

What I want is if a date is missing, this will add 0 to the trainingload sum of 7 days.

CodePudding user response:

You can resample to create the missing day and fill with 0 before doing sum.

df = df.resample('D').sum().fillna(0)

CodePudding user response:

I am providing two approaches because your question seems little bit ambiguous:
First approach:

# Convert the Date column to a datetime type
df['Date'] = pd.to_datetime(df['Date'])

# Create a new column to store the sum of the last 7 days
df['AcuteLegs_7days'] = df.AcuteLegs_1day.rolling(7, min_periods=1).sum()
print(df)

Output:

        Date  AcuteLegs_1day  AcuteLegs_7days
0 2021-12-09           450.0            450.0
1 2021-12-10           840.0           1290.0
2 2021-12-13           900.0           2190.0
3 2021-12-14           630.0           2820.0
4 2021-12-15           400.0           3220.0
5 2021-12-16           360.0           3580.0
6 2021-12-19           150.0           3730.0
7 2021-12-20           600.0           3880.0
8 2021-12-21           600.0           3640.0

Second approach:

# Convert the Date column to a datetime type
df['Date'] = pd.to_datetime(df['Date'])

# Set the Date column as the index of the dataframe
df.set_index('Date', inplace=True)

# Resample the data to daily frequency, filling missing values with 0
df = df.resample('D').asfreq().fillna(0)
df['AcuteLegs_7days'] = df['AcuteLegs_1day'].rolling(7, min_periods=1).sum()

print(df)

Output:

            AcuteLegs_1day  AcuteLegs_7days
Date                                       
2021-12-09           450.0            450.0
2021-12-10           840.0           1290.0
2021-12-11             0.0           1290.0
2021-12-12             0.0           1290.0
2021-12-13           900.0           2190.0
2021-12-14           630.0           2820.0
2021-12-15           400.0           3220.0
2021-12-16           360.0           3130.0
2021-12-17             0.0           2290.0
2021-12-18             0.0           2290.0
2021-12-19           150.0           2440.0
2021-12-20           600.0           2140.0
2021-12-21           600.0           2110.0

CodePudding user response:

To compute the sum of the AcuteLegs_1day column over the last 7 days, including days with missing values and treating them as 0, you can use the resample method and the sum function.

df['AcuteLegs_7days'] = df.resample('7D', on='Date')['AcuteLegs_1day'].sum()
  • Related