Pandas - filtering hours in datetime data

I have data in the date and time format below. I think the '.000Z' at the end is the milliseconds plus a timezone marker (Z for UTC). Sorry, no reprex as such.

2021-01-14T21:00:00.000Z
2021-01-14T22:00:00.000Z
2021-01-14T23:00:00.000Z
2021-01-15T00:00:00.000Z
2021-01-15T02:00:00.000Z
2021-01-15T03:00:00.000Z
2021-01-15T04:00:00.000Z
2021-01-15T05:00:00.000Z

Ideally I would like to filter rows of the df by hours and minutes.

I did extract just the hours/minutes/seconds, by using

pd.to_datetime(df['columnxyz']).dt.time

which returns the following as dtype object

21:00:00
22:00:00
23:00:00
00:00:00
01:00:00
02:00:00
03:00:00

etc..

Then I tried using between_time to filter the rows with times I need.

df.between_time('10:15', '20:45')

No go! I get the error 'Index must be DatetimeIndex'. I thought that the earlier pandas to_datetime call would give me a DatetimeIndex, but it just seems to give me a Series of dtype object.

I'm confused as to why it doesn't work. But now perhaps I should convert the hours/minute series above to a str, then split it, and convert it to an int. Though I'm sure there's a more elegant solution!

Any help appreciated!

CodePudding user response：


import pandas as pd

# create a column with the full datetime type
df['col'] = pd.to_datetime(df['columnxyz'])

# set the index to the datetime column, since between_time needs a DatetimeIndex
df = df.set_index('col')


# apply filtering (your code)
df.between_time('10:15', '22:45').reset_index()

                          col                  columnxyz
0   2021-01-14 21:00:00+00:00   2021-01-14T21:00:00.000Z
1   2021-01-14 22:00:00+00:00   2021-01-14T22:00:00.000Z
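
If you would rather leave the index alone, a boolean mask on the time of day should give the same rows. This is only a minimal sketch: it assumes the column is called columnxyz as in the question and reuses the sample values posted there. The names ts, start, end, mask and filtered are just illustrative, not anything from the original code.

```python
from datetime import time

import pandas as pd

# Sample values copied from the question; 'columnxyz' is the asker's column name.
df = pd.DataFrame({
    "columnxyz": [
        "2021-01-14T21:00:00.000Z",
        "2021-01-14T22:00:00.000Z",
        "2021-01-14T23:00:00.000Z",
        "2021-01-15T00:00:00.000Z",
        "2021-01-15T02:00:00.000Z",
        "2021-01-15T03:00:00.000Z",
        "2021-01-15T04:00:00.000Z",
        "2021-01-15T05:00:00.000Z",
    ]
})

# Parse the strings into timezone-aware (UTC) timestamps.
ts = pd.to_datetime(df["columnxyz"], utc=True)

# Illustrative bounds, matching the window used with between_time above.
start = time(10, 15)
end = time(22, 45)

# .dt.time gives plain datetime.time objects (hence dtype object),
# which can be compared element-wise against the bounds.
times = ts.dt.time
mask = (times >= start) & (times <= end)

# Keep only the rows whose time of day (in UTC) falls inside the window.
filtered = df[mask]
print(filtered)
```

Note that the comparison is done on UTC times here, because that is how the strings are stamped. A window that wraps past midnight (say 22:00 to 02:00) would need an "or" of the two conditions instead of an "and".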