I have a series of dates that only contains weekdays (no weekends)
from datetime import datetime as dtm
df=pd.DataFrame([dtm(2022,3,1),dtm(2022,3,2),dtm(2022,3,3),dtm(2022, 4, 15),dtm(2022,4,18),dtm(2022,4,19),dtm(2022,4,20)],columns=['dates'])
dates
0 2022-03-01
1 2022-03-02
2 2022-03-03
3 2022-04-15
4 2022-04-18
5 2022-04-19
6 2022-04-20
I want to create a dataframe where only start and end dates of consecutive WEEKDAY dates, using the above example, the desired output is
dates
0 2022-03-01
1 2022-03-03
2 2022-04-15
3 2022-04-20
CodePudding user response:
Use DataFrame.groupby
for difference between days if greater like 7
with aggregate min
and max
datetimes, then reshape by DataFrame.stack
:
df = (df.groupby((~df['dates'].diff().dt.days.le(7)).cumsum())['dates']
.agg(['min','max'])
.stack()
.reset_index(drop=True))
print (df)
0 2022-03-01
1 2022-03-03
2 2022-04-15
3 2022-04-20
dtype: datetime64[ns]