I have dates in sequential and some are not in sequence. How can I group those dates to 'From date' and 'To Date'?
Name Date
ABC Jan 1, 2022
ABC Jan 2, 2022
ABC Jan 3, 2022
ABC Feb 1, 2022
DEF Jan 1, 2022
DEF Mar 1, 2022
DEF Mar 2, 2022
This should group as
Name From To
ABC Jan 1, 2022 Jan 3, 2022
ABC Feb 1, 2022 Feb 1, 2022
DEF Jan 1, 2022 Jan 1, 2022
DEF Mar 1, 2022 Mar 2, 2022
This is just the reverse of dates.explode (frequency day) where all the dates between two dates are converted to list of dates, but here I want to group those to 'from and to date'.
CodePudding user response:
Doing diff
with cumsum
create the groupby
key
x = pd.to_datetime(df.Date).diff().dt.days.ne(1).cumsum()
out = df.groupby([df['Name'],x])['Date'].agg(['first','last']).reset_index(level=0)
Out[219]:
Name first last
Date
1 ABC Jan 1, 2022 Jan 3, 2022
2 ABC Feb 1, 2022 Feb 1, 2022
3 DEF Jan 1, 2022 Jan 1, 2022
4 DEF Mar 1, 2022 Mar 2, 2022