Home > Net >  Create column based on pandas.DataFrame.between_time() without time be the index column
Create column based on pandas.DataFrame.between_time() without time be the index column

Time:11-27

I got a dataframe with a date/time in seconds, which I changed by:

df["start"]   = pd.to_datetime(df["start"], unit='s')
df["time"]    = df["start"].dt.time

Now I would like to add a column df["timeofday"], which include the time of day string.

0:00 - 5:59 night

6:00 - 11:59 morning

12:00 - 17:59 afternoon

18:00 - 21:59 evening

22:00 - 23:59 night

I assume that I need to use a for loop with between_time() on df.time. However, this does not work because I seem to need to use the column time as the index column of the dataframe. However, the dataframe has an index that I don't want to lose. Even if I could add a second index and then filter on each time period, it would not be clear to me how to insert the respective string into the new column timeofday.

I tried to filter like

df.time.between_time('02:00', '03:30')

Which leads to

TypeError: Index must be DatetimeIndex

So I assumed I need to set the time column as new index

df.set_index("time", inplace=True)
df["timeofday"] = 'night'
df["timeofday"][df.time.between_time('06:00', '11:59')] = "morning"

which leads to the same

TypeError: Index must be DatetimeIndex

After that I tried

df.set_index("start", inplace=True)
df["timeofday"] = 'night'
df["timeofday"][df.between_time('06:00', '11:59')] = "morning"

Leads to

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame InvalidIndexError

CodePudding user response:

Found a solution

df.set_index("start", inplace=True)
df["timeofday"] = 'night'
mask = df.between_time('06:00', '11:59')
df.loc[mask.index, 'timeofday'] = "morning"

CodePudding user response:

We can use pandas.DataFrame.loc and pandas.Series.between to accomplish this. Here is a full example.

Solution

import pandas as pd
from io import StringIO

# Example data with expected result so we can check our work later
input_data = """
start,expected_timeofday
2022-11-26 01:41:26,night
2022-11-26 03:13:06,night
2022-11-26 04:40:58,night
2022-11-26 06:07:06,morning
2022-11-26 06:27:14,morning
2022-11-26 06:28:16,morning
2022-11-26 07:34:46,morning
2022-11-26 10:01:44,morning
2022-11-26 13:45:08,afternoon
2022-11-26 15:40:36,afternoon
2022-11-26 15:59:00,afternoon
2022-11-26 16:51:03,afternoon
2022-11-26 17:15:42,afternoon
2022-11-26 18:24:02,evening
2022-11-26 18:34:37,evening
2022-11-26 19:21:00,evening
2022-11-26 19:41:17,evening
2022-11-26 21:53:10,evening
2022-11-26 23:16:29,night
2022-11-26 23:36:08,night
""".strip()

# Read example data from CSV-formatted string
df = pd.read_csv(StringIO(input_data), parse_dates=['start'])

class TimeOfDay():
    MORNING = 'morning'
    AFTERNOON = 'afternoon'
    EVENING = 'evening'
    NIGHT = 'night'

# Set `timeofday` category by using a filter on the
# hour property of the datetime column `start`.
df['timeofday'] = None
df.loc[df.start.dt.hour.between(0, 6, inclusive='left'), 'timeofday'] = TimeOfDay.NIGHT
df.loc[df.start.dt.hour.between(6, 12, inclusive='left'), 'timeofday'] = TimeOfDay.MORNING
df.loc[df.start.dt.hour.between(12, 18, inclusive='left'), 'timeofday'] = TimeOfDay.AFTERNOON
df.loc[df.start.dt.hour.between(18, 22, inclusive='left'), 'timeofday'] = TimeOfDay.EVENING
df.loc[df.start.dt.hour.between(22, 24, inclusive='left'), 'timeofday'] = TimeOfDay.NIGHT

# Check our work; raises an exception if we made a mistake
assert((df.timeofday == df.expected_timeofday).all())

# Result
print(df)

Result

                 start expected_timeofday  timeofday
0  2022-11-26 01:41:26              night      night
1  2022-11-26 03:13:06              night      night
2  2022-11-26 04:40:58              night      night
3  2022-11-26 06:07:06            morning    morning
4  2022-11-26 06:27:14            morning    morning
5  2022-11-26 06:28:16            morning    morning
6  2022-11-26 07:34:46            morning    morning
7  2022-11-26 10:01:44            morning    morning
8  2022-11-26 13:45:08          afternoon  afternoon
9  2022-11-26 15:40:36          afternoon  afternoon
10 2022-11-26 15:59:00          afternoon  afternoon
11 2022-11-26 16:51:03          afternoon  afternoon
12 2022-11-26 17:15:42          afternoon  afternoon
13 2022-11-26 18:24:02            evening    evening
14 2022-11-26 18:34:37            evening    evening
15 2022-11-26 19:21:00            evening    evening
16 2022-11-26 19:41:17            evening    evening
17 2022-11-26 21:53:10            evening    evening
18 2022-11-26 23:16:29              night      night
19 2022-11-26 23:36:08              night      night
  • Related