Split Dataframe dates into individual min max date ranges by group

Time:10-22

I have a dataframe which looks something like this:

S.No  date          origin  dest    journeytype
1     2021-10-21    FKG      HYM    OP
2     2021-10-21    FKG      HYM    PK
3     2021-10-21    HYM      LDS    OP
4     2021-10-22    FKG      HYM    OP
5     2021-10-22    FKG      HYM    PK
6     2021-10-22    HYM      LDS    OP
7     2021-10-23    FKG      HYM    OP
8     2021-10-24    AVM      BLA    OP
9     2021-10-24    AVM      DBL    OP
10    2021-10-27    AVM      BLA    OP

I need to collapse the rows for each origin, dest & journeytype combination into a single record with start_date and end_date columns.

Output dataframe for the above input should look like:

start_date  end_date   origin   dest    journeytype
2021-10-21  2021-10-23  FKG     HYM     OP
2021-10-21  2021-10-22  FKG     HYM     PK
2021-10-21  2021-10-22  HYM     LDS     OP
2021-10-24  2021-10-24  AVM     BLA     OP
2021-10-24  2021-10-24  AVM     DBL     OP
2021-10-27  2021-10-27  AVM     BLA     OP

Also, if the dates for any group are non-continuous, each continuous run needs to be shown as a separate record in the result.

CodePudding user response:

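If consecutive dates can be identified by checking, within each group, whether the difference from the previous date is greater than 1 day, use the cumulative sum of those flags as an extra grouping key: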

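import pandas as pd

# convert to datetimes so differences can be measured in days
df['date'] = pd.to_datetime(df['date'])

# flag rows more than 1 day after the previous date of the same group;
# the cumulative sum labels each run (rows must be sorted by date within each group)
g = df.groupby(['origin','dest','journeytype'])['date'].diff().dt.days.gt(1).cumsum()

# aggregate the first and last date per group and run
df = (df.groupby(['origin','dest','journeytype', g], sort=False)['date']
        .agg(start_date='min', end_date='max')
        .reset_index())

df = df[['start_date', 'end_date','origin', 'dest', 'journeytype']]
print(df)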
  start_date   end_date origin dest journeytype
0 2021-10-21 2021-10-23    FKG  HYM          OP
1 2021-10-21 2021-10-22    FKG  HYM          PK
2 2021-10-21 2021-10-22    HYM  LDS          OP
3 2021-10-24 2021-10-24    AVM  BLA          OP
4 2021-10-24 2021-10-24    AVM  DBL          OP
5 2021-10-27 2021-10-27    AVM  BLA          OP
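
The solution above relies on the rows already being in date order within each group, as they are in the question. A minimal sketch of the same diff/cumsum idea wrapped in a function that sorts first is shown below. The function name `split_date_ranges`, the helper column name `run_id`, and the variables `sample` and `ranges` are hypothetical names chosen for illustration, and the sample data is the table from the question in shuffled order.

```python
import pandas as pd


def split_date_ranges(df, keys=("origin", "dest", "journeytype")):
    """Collapse each run of consecutive dates per group into one row.

    Hypothetical helper: same diff/cumsum technique as the answer,
    plus an explicit sort so the input order does not matter.
    """
    keys = list(keys)
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"])
    # Sort so diff() compares each date with the previous date of its group.
    out = out.sort_values(keys + ["date"], kind="stable")
    # True where the gap to the previous date in the group exceeds one day.
    new_run = out.groupby(keys)["date"].diff().dt.days.gt(1)
    # The running count of gaps gives every consecutive run its own label.
    run_id = new_run.cumsum().rename("run_id")
    result = (
        out.groupby(keys + [run_id], sort=False)["date"]
        .agg(start_date="min", end_date="max")
        .reset_index()
        .drop(columns="run_id")
    )
    result = result[["start_date", "end_date"] + keys]
    # Order the output by start date, then by the group keys.
    return result.sort_values(["start_date"] + keys, ignore_index=True)


# Sample data from the question, deliberately out of order.
sample = pd.DataFrame(
    {
        "date": ["2021-10-27", "2021-10-23", "2021-10-21", "2021-10-21",
                 "2021-10-21", "2021-10-22", "2021-10-22", "2021-10-22",
                 "2021-10-24", "2021-10-24"],
        "origin": ["AVM", "FKG", "FKG", "FKG", "HYM",
                   "FKG", "FKG", "HYM", "AVM", "AVM"],
        "dest": ["BLA", "HYM", "HYM", "HYM", "LDS",
                 "HYM", "HYM", "LDS", "BLA", "DBL"],
        "journeytype": ["OP", "OP", "OP", "PK", "OP",
                        "OP", "PK", "OP", "OP", "OP"],
    }
)

ranges = split_date_ranges(sample)
print(ranges)
```

Because the run label is a single running counter over the whole frame rather than one counter per group, it is only meaningful in combination with the group keys, which is why it is passed to `groupby` alongside them.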