Pandas rolling datetime not accepting datetime offset

Time:03-09

My dataframe has the following dtypes:

weekday               int64
date         datetime64[ns]
time                 object
customers             int64
dtype: object

I'd like the customers column to hold the count of customers who arrived in the past 2 hours, where the arrival timestamp is stored in the date column. However, with the pandas rolling functionality I can only write

df['customers'] = df['date'].rolling(2).count()

This only counts the previous two rows and completely disregards the datetime values. I'd like to write

df['customers'] = df['date'].rolling('2H').count() #desired: 2H

to get the correct result. However, I get ValueError: window must be an integer. According to the pandas rolling documentation (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html), rolling should accept a time-based window for datetime data. I don't understand why my datetime column cannot use this functionality.

CodePudding user response:

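A time-based window such as '2H' is measured against the index (or against the column named in the on= argument), not against the values being rolled. df['date'] is a Series with the default integer index, so pandas insists on an integer window. Create a sorted DatetimeIndex and roll over the customers column instead: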

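import pandas as pd

df['date'] = pd.to_datetime(df['date'])  # make sure the column is datetime64
df = df.set_index('date').sort_index()  # time-based windows need a sorted DatetimeIndex
df['customers'] = df['customers'].rolling('2H').count()  # rows within the trailing 2 hours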
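
Since the original dataframe is not shown, here is a minimal self-contained sketch on made-up data (the timestamps and customer numbers below are assumptions for illustration only). It uses pd.Timedelta(hours=2) as the window, which should behave the same as the '2H' string; as far as I know, recent pandas releases prefer the lowercase '2h' spelling. It also shows the on='date' alternative, which keeps date as a regular column, and .sum() in case the goal is to total the customers rather than count rows.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe from the question is not available.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2022-03-09 08:00", "2022-03-09 08:30", "2022-03-09 09:45",
        "2022-03-09 10:15", "2022-03-09 13:00",
    ]),
    "customers": [3, 1, 4, 2, 5],
})

window = pd.Timedelta(hours=2)  # trailing 2-hour window, right-closed by default

# Option 1: move the timestamps into a sorted DatetimeIndex.
indexed = df.set_index("date").sort_index()
rows_in_window = indexed["customers"].rolling(window).count()     # number of rows
customers_in_window = indexed["customers"].rolling(window).sum()  # total customers

# Option 2: keep 'date' as a column and point rolling at it with on=.
# The column still has to be sorted.
by_column = df.sort_values("date").rolling(window, on="date")["customers"].sum()

print(rows_in_window)
print(customers_in_window)
```

With count() each row gets the number of rows in the preceding 2 hours (including itself), while sum() adds up the customers values over the same window. Which one is right depends on whether each row represents one arrival or a batch of customers.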