I have a question about selecting a range in a pandas DataFrame in Python. I have a column with times and a column with values. I would like to select all the rows with times between 6 a.m. and 6 p.m. (so from 6:00:00 to 18:00:00). I've succeeded at selecting all the night times (between 18:00:00 and 6:00:00), but if I apply the same to the day times, it doesn't work. Is there something wrong with my syntax? Below is a minimal working example. timeslice2 returns an empty DataFrame in my case.
import pandas as pd
times = ("1:00:00", "2:00:00", "3:00:00", "4:00:00", "5:00:00", "6:00:00", "7:00:00", "8:00:00", "9:00:00", \
"10:00:00", "11:00:00", "12:00:00", "13:00:00", "14:00:00", "15:00:00", "16:00:00", "17:00:00", \
"18:00:00", "19:00:00", "20:00:00", "21:00:00", "22:00:00", "23:00:00")
values = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
data = zip(times, values)
colnames = ["Time", "values"]
df = pd.DataFrame(data=data, columns=colnames)
print(df)
# selecting only night times
timeslice1 = df[(df['Time'] > '18:00:00') & (df['Time'] <= '6:00:00')]
# selecting only day times
timeslice2 = df[(df['Time'] > '6:00:00') & (df['Time'] <= '18:00:00')]
print(timeslice1)
print(timeslice2)
I've been able to select the right range with this answer, but it seems strange to me that the above doesn't work. Moreover, if I convert my 'Time' column to 'datetime', as needed, it uses the date of today and I don't want that.
CodePudding user response:
This way it works, the first range if treated like datetime has no results because it will mean two different dates (days) in chronological order.
import pandas as pd
times = ("1:00:00", "2:00:00", "3:00:00", "4:00:00", "5:00:00", "6:00:00", "7:00:00", "8:00:00", "9:00:00", \
"10:00:00", "11:00:00", "12:00:00", "13:00:00", "14:00:00", "15:00:00", "16:00:00", "17:00:00", \
"18:00:00", "19:00:00", "20:00:00", "21:00:00", "22:00:00", "23:00:00")
values = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
data = zip(times, values)
colnames = ["Time", "values"]
df = pd.DataFrame(data=data, columns=colnames)
print('Original df \n',df)
# selecting only night times
timeslice1 = df[(df['Time'] > '18:00:00') & (df['Time'] <= '6:00:00')]
# selecting only day times
#conver Time column to datetime
df['Time'] = pd.to_datetime(df['Time'])
timeslice2 = df[(df['Time'] > '6:00:00') & (df['Time'] <= '18:00:00')]
#convert df back to string
timeslice2["Time"] = timeslice2["Time"].dt.strftime('%H:%M:%S')
print('Slice 1 \n', timeslice1)
print('Slice 2 \n', timeslice2)