Home > other >  Updating date values based on number of days between startdate and enddate
Updating date values based on number of days between startdate and enddate

Time:09-17

I have the following table and I need to re-calculate the StartDate based on some business rules that I was given. Specifically, when the StartDate of the succeeding session is within one calendar day (i.e., < 2 days) the sessions are combined, using the earlier StartDate and the later EndDate.

StartDate EndDate DaysBetweenSessions
2011-04-15 2011-04-21 3
2011-04-24 2011-04-27 3
2011-04-30 2011-05-01 0
2011-05-01 2011-05-06 5
2011-05-11 2011-05-15 1
2011-05-16 2011-05-17 0
2011-05-17 2011-05-25 10

What I need as a result:

StartDate EndDate DaysBetweenSessions NewStartDate NewEndDate
2011-04-15 2011-04-21 3 2011-04-15 2011-04-21
2011-04-24 2011-04-27 3 2011-04-24 2011-04-27
2011-04-30 2011-05-01 0 2011-04-30 2011-05-06
2011-05-01 2011-05-06 5 2011-04-30 2011-05-06
2011-05-11 2011-05-15 1 2011-05-11 2011-05-25
2011-05-16 2011-05-17 0 2011-05-11 2011-05-25
2011-05-17 2011-05-25 10 2011-05-11 2011-05-25

I have attempted to use df.shift to get the PreviousStartDate, PreviousEndDate, NextStartDate, NextEndDate, and then tried to loop through the dataset with df.iterrows(), calculating the new dates, but I run into a technical issue that I don't fully understand - while in the loop that was used to update a value, I cannot use an operator on that new value - the new value hasn't been written to the df yet.

CodePudding user response:

Try with groupby and tranform:

#convert to datetime if necessary
df["StartDate"] = pd.to_datetime(df["StartDate"])
df["EndDate"] = pd.to_datetime(df["EndDate"])

df["NewStartDate"] = df.groupby((df["DaysBetweenSessions"]>=2).cumsum().shift().fillna(0))["StartDate"].transform("min")
df["NewEndDate"] = df.groupby((df["DaysBetweenSessions"]>=2).cumsum().shift().fillna(0))["EndDate"].transform("max")

>>> df
  StartDate    EndDate  DaysBetweenSessions NewStartDate NewEndDate
0 2011-04-15 2011-04-21                    3   2011-04-15 2011-04-21
1 2011-04-24 2011-04-27                    3   2011-04-24 2011-04-27
2 2011-04-30 2011-05-01                    0   2011-04-30 2011-05-06
3 2011-05-01 2011-05-06                    5   2011-04-30 2011-05-06
4 2011-05-11 2011-05-15                    1   2011-05-11 2011-05-25
5 2011-05-16 2011-05-17                    0   2011-05-11 2011-05-25
6 2011-05-17 2011-05-25                   10   2011-05-11 2011-05-25
  • Related