I have this df:
Index Dates
0 2017-01-01 23:30:00
1 2017-01-12 22:30:00
2 2017-01-20 13:35:00
3 2017-01-21 14:25:00
4 2017-01-28 22:30:00
5 2017-08-01 13:00:00
6 2017-09-26 09:39:00
7 2017-10-08 06:40:00
8 2017-10-04 07:30:00
9 2017-12-13 07:40:00
10 2017-12-31 14:55:00
I am trying to select a time range from 5:00:00 to 11:59:00 in the morning, in all months:
df_new=df['Dates'].between(('2017-01-01 5:00:00'),('2017-12-31 11:59:00'))
The line above returns True for every row except the last one. I understand that this happens because the comparison uses the full date range, which covers every date but the last. How should I fix the code to get True only for rows whose time falls within the range of hours above?
CodePudding user response:
IIUC you need to work with the time, not the whole date range:
import datetime
times = df.Dates.dt.time
df[(datetime.time(5) < times) & (times < datetime.time(11, 59))]
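For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the frame from the question (assuming Dates is parsed as datetime) and uses inclusive bounds, which is my assumption about what the asker wants; the snippet above uses strict bounds. The names mask and morning are only illustrative.

```python
import datetime

import pandas as pd

# Sample data copied from the question; Dates is parsed to datetime64.
df = pd.DataFrame({"Dates": pd.to_datetime([
    "2017-01-01 23:30:00", "2017-01-12 22:30:00", "2017-01-20 13:35:00",
    "2017-01-21 14:25:00", "2017-01-28 22:30:00", "2017-08-01 13:00:00",
    "2017-09-26 09:39:00", "2017-10-08 06:40:00", "2017-10-04 07:30:00",
    "2017-12-13 07:40:00", "2017-12-31 14:55:00",
])})

# .dt.time drops the date and keeps datetime.time objects.
times = df["Dates"].dt.time

# Inclusive bounds (assumption): 05:00:00 <= time <= 11:59:00.
mask = (times >= datetime.time(5)) & (times <= datetime.time(11, 59))
morning = df[mask]
print(morning)
```

Only the rows whose clock time falls in the window are kept, regardless of the date.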
CodePudding user response:
You would need to extract only the time part and use leading zeros in the strings:
m = (pd.to_datetime(df['Dates']).dt.strftime('%H:%M:%S')
.between('05:00:00','11:59:00')
)
out = df[m]
output:
Index Dates
6 6 2017-09-26 09:39:00
7 7 2017-10-08 06:40:00
8 8 2017-10-04 07:30:00
9 9 2017-12-13 07:40:00
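To see why the leading zeros matter here: the strftime approach compares strings, and strings compare character by character. A small sketch with a few made-up time strings (the Series s is hypothetical, not the original data):

```python
import pandas as pd

# Zero-padded time strings, as produced by strftime('%H:%M:%S').
s = pd.Series(["09:39:00", "06:40:00", "13:00:00"])

# With a zero-padded lower bound the lexicographic order matches clock order.
padded = s.between("05:00:00", "11:59:00")

# Without padding, "5:00:00" sorts after "11:59:00" (because "5" > "1"),
# so the range is empty and nothing matches.
unpadded = s.between("5:00:00", "11:59:00")

print(padded.tolist())
print(unpadded.tolist())
```

This is only a concern for the string comparison; the datetime.time and between_time approaches compare real time values.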
NB. To be able to use between_time you would need a datetime index:
df['Dates'] = pd.to_datetime(df['Dates'])
df.set_index('Dates').between_time('05:00:00','11:59:00').reset_index()
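If you would rather keep the original integer index, one possible variation (not part of the answer above, just a sketch) is to build a temporary DatetimeIndex and ask it for the matching positions with indexer_between_time, then select those rows with iloc. The names positions and out are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Dates": pd.to_datetime([
    "2017-01-01 23:30:00", "2017-01-12 22:30:00", "2017-01-20 13:35:00",
    "2017-01-21 14:25:00", "2017-01-28 22:30:00", "2017-08-01 13:00:00",
    "2017-09-26 09:39:00", "2017-10-08 06:40:00", "2017-10-04 07:30:00",
    "2017-12-13 07:40:00", "2017-12-31 14:55:00",
])})

# Temporary DatetimeIndex built from the column; df itself is not modified.
idx = pd.DatetimeIndex(df["Dates"])

# Integer positions of the rows whose time of day is within the window.
positions = idx.indexer_between_time("05:00:00", "11:59:00")

# Select by position so the original 0..10 index labels are preserved.
out = df.iloc[positions]
print(out)
```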
CodePudding user response:
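Use the between_time function in pandas. It requires a DatetimeIndex, so set the Dates column as the index first:
df['Dates'] = pd.to_datetime(df['Dates'])
df_new = df.set_index('Dates').between_time('5:00:00', '11:59:00')
See the official pandas documentation for between_time.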