I have data in the date and time format below. I think the '.000Z' at the end is a timezone offset. Sorry, no reprex as such.
2021-01-14T21:00:00.000Z
2021-01-14T22:00:00.000Z
2021-01-14T23:00:00.000Z
2021-01-15T00:00:00.000Z
2021-01-15T02:00:00.000Z
2021-01-15T03:00:00.000Z
2021-01-15T04:00:00.000Z
2021-01-15T05:00:00.000Z
Ideally I would like to filter rows of the df by hours and minutes.
I did extract just the hours/minutes/seconds, by using
pd.to_datetime(df['columnxyz']).dt.time
which returns the following as dtype object
21:00:00
22:00:00
23:00:00
00:00:00
01:00:00
02:00:00
03:00:00
etc..
Then I tried using between_time to filter the rows with times I need.
df.between_time('10:15', '20:45')
No go! I get the error 'Index must be DatetimeIndex'. I thought the earlier pandas to_datetime call would give me a DatetimeIndex, but it just seems to give me an object column.
I'm confused as to why it doesn't work. Perhaps I should convert the hours/minutes series above to str, split it, and convert the parts to int, though I'm sure there's a more elegant solution!
Any help appreciated!
CodePudding user response:
# create a column with the datetime type (and not just date part)
df['col'] = pd.to_datetime(df['columnxyz'])
# set the index to the datetime col (between_time requires a DatetimeIndex)
df = df.set_index('col')
# apply filtering (your code)
df.between_time('10:15', '22:45').reset_index()
                        col                 columnxyz
0 2021-01-14 21:00:00+00:00  2021-01-14T21:00:00.000Z
1 2021-01-14 22:00:00+00:00  2021-01-14T22:00:00.000Z
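For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is an assumption built from the first few values in the question, and the names ts, by_index and by_pos are made up for illustration. It also shows one possible alternative that leaves the original index alone, using DatetimeIndex.indexer_between_time to get row positions.

```python
import pandas as pd

# Assumed sample data, copied from the first rows shown in the question.
df = pd.DataFrame({
    "columnxyz": [
        "2021-01-14T21:00:00.000Z",
        "2021-01-14T22:00:00.000Z",
        "2021-01-14T23:00:00.000Z",
        "2021-01-15T00:00:00.000Z",
    ]
})

# Parse the ISO 8601 strings. '.000' is milliseconds and 'Z' means UTC,
# so the parsed values are timezone-aware.
ts = pd.to_datetime(df["columnxyz"])

# Option 1: between_time needs a DatetimeIndex, so index the frame by the
# parsed timestamps, filter, then move the index back into a column.
by_index = (
    df.set_index(ts.rename("col"))
      .between_time("10:15", "22:45")
      .reset_index()
)

# Option 2: keep the original index and select rows by position instead.
positions = pd.DatetimeIndex(ts).indexer_between_time("10:15", "22:45")
by_pos = df.iloc[positions]

print(by_index)
print(by_pos)
```

Both options compare against the time of day in the parsed timezone (UTC here), so convert with tz_convert first if the filter is meant to apply to local times.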