I got a dataframe with a date/time in seconds, which I changed by:
df["start"] = pd.to_datetime(df["start"], unit='s')
df["time"] = df["start"].dt.time
Now I would like to add a column df["timeofday"], which include the time of day string.
0:00 - 5:59 night
6:00 - 11:59 morning
12:00 - 17:59 afternoon
18:00 - 21:59 evening
22:00 - 23:59 night
I assume that I need to use a for loop with between_time() on df.time. However, this does not work because I seem to need to use the column time as the index column of the dataframe. However, the dataframe has an index that I don't want to lose. Even if I could add a second index and then filter on each time period, it would not be clear to me how to insert the respective string into the new column timeofday.
I tried to filter like
df.time.between_time('02:00', '03:30')
Which leads to
TypeError: Index must be DatetimeIndex
So I assumed I need to set the time column as new index
df.set_index("time", inplace=True)
df["timeofday"] = 'night'
df["timeofday"][df.time.between_time('06:00', '11:59')] = "morning"
which leads to the same
TypeError: Index must be DatetimeIndex
After that I tried
df.set_index("start", inplace=True)
df["timeofday"] = 'night'
df["timeofday"][df.between_time('06:00', '11:59')] = "morning"
Leads to
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame InvalidIndexError
CodePudding user response:
Found a solution
df.set_index("start", inplace=True)
df["timeofday"] = 'night'
mask = df.between_time('06:00', '11:59')
df.loc[mask.index, 'timeofday'] = "morning"
CodePudding user response:
We can use pandas.DataFrame.loc
and pandas.Series.between
to accomplish this. Here is a full example.
Solution
import pandas as pd
from io import StringIO
# Example data with expected result so we can check our work later
input_data = """
start,expected_timeofday
2022-11-26 01:41:26,night
2022-11-26 03:13:06,night
2022-11-26 04:40:58,night
2022-11-26 06:07:06,morning
2022-11-26 06:27:14,morning
2022-11-26 06:28:16,morning
2022-11-26 07:34:46,morning
2022-11-26 10:01:44,morning
2022-11-26 13:45:08,afternoon
2022-11-26 15:40:36,afternoon
2022-11-26 15:59:00,afternoon
2022-11-26 16:51:03,afternoon
2022-11-26 17:15:42,afternoon
2022-11-26 18:24:02,evening
2022-11-26 18:34:37,evening
2022-11-26 19:21:00,evening
2022-11-26 19:41:17,evening
2022-11-26 21:53:10,evening
2022-11-26 23:16:29,night
2022-11-26 23:36:08,night
""".strip()
# Read example data from CSV-formatted string
df = pd.read_csv(StringIO(input_data), parse_dates=['start'])
class TimeOfDay():
MORNING = 'morning'
AFTERNOON = 'afternoon'
EVENING = 'evening'
NIGHT = 'night'
# Set `timeofday` category by using a filter on the
# hour property of the datetime column `start`.
df['timeofday'] = None
df.loc[df.start.dt.hour.between(0, 6, inclusive='left'), 'timeofday'] = TimeOfDay.NIGHT
df.loc[df.start.dt.hour.between(6, 12, inclusive='left'), 'timeofday'] = TimeOfDay.MORNING
df.loc[df.start.dt.hour.between(12, 18, inclusive='left'), 'timeofday'] = TimeOfDay.AFTERNOON
df.loc[df.start.dt.hour.between(18, 22, inclusive='left'), 'timeofday'] = TimeOfDay.EVENING
df.loc[df.start.dt.hour.between(22, 24, inclusive='left'), 'timeofday'] = TimeOfDay.NIGHT
# Check our work; raises an exception if we made a mistake
assert((df.timeofday == df.expected_timeofday).all())
# Result
print(df)
Result
start expected_timeofday timeofday
0 2022-11-26 01:41:26 night night
1 2022-11-26 03:13:06 night night
2 2022-11-26 04:40:58 night night
3 2022-11-26 06:07:06 morning morning
4 2022-11-26 06:27:14 morning morning
5 2022-11-26 06:28:16 morning morning
6 2022-11-26 07:34:46 morning morning
7 2022-11-26 10:01:44 morning morning
8 2022-11-26 13:45:08 afternoon afternoon
9 2022-11-26 15:40:36 afternoon afternoon
10 2022-11-26 15:59:00 afternoon afternoon
11 2022-11-26 16:51:03 afternoon afternoon
12 2022-11-26 17:15:42 afternoon afternoon
13 2022-11-26 18:24:02 evening evening
14 2022-11-26 18:34:37 evening evening
15 2022-11-26 19:21:00 evening evening
16 2022-11-26 19:41:17 evening evening
17 2022-11-26 21:53:10 evening evening
18 2022-11-26 23:16:29 night night
19 2022-11-26 23:36:08 night night