I have a dataset with each person's sleeping and waking times, as shown below.
id | current_time | asleep | wokeup |
---|---|---|---|
25 | 00:01 | 23:10 | 6:15 |
25 | 02:01 | 23:10 | 6:15 |
25 | 06:55 | 23:10 | 6:15 |
18 | 22:03 | 00:10 | 6:15 |
18 | 23:33 | 00:10 | 6:15 |
18 | 00:33 | 00:10 | 6:15 |
I would like to add a new column named wake. This column should be binary: if current_time is between asleep and wokeup, then wake should be 0, otherwise 1. I expect a table like the one below.
id | current_time | asleep | wokeup | wake |
---|---|---|---|---|
25 | 00:01 | 23:10 | 6:15 | 0 |
25 | 02:01 | 23:10 | 6:15 | 0 |
25 | 06:55 | 23:10 | 6:15 | 1 |
18 | 22:03 | 00:10 | 6:15 | 1 |
18 | 23:33 | 00:10 | 6:15 | 1 |
18 | 00:33 | 00:10 | 6:15 | 0 |
My dataset is called df_clean2.
I attempted to solve it as shown below. First (1), I assigned the value 0 to the wake column for all rows. Then (2), I wrote a boolean condition intended to change the wake value to 1 for the rows where the person is awake.
#(1)
df_clean2['wake'] = 0
#(2)
df_clean2['wake'] = np.where(((df_clean2.time < df_clean2.asleep) | (df_clean2.time > df_clean2.wokeup)), df_clean2.wake, 1)
However, this code does not change the wake value for id 25 at current_time 06:55: I get a wake value of 0. I also tried an "and" condition instead of "or", but that does not work either. Can you help me modify my code so that I get the expected output?
CodePudding user response:
This is tricky as the comparison is circular. You need to compare the current time to each bound but also the bounds themselves:
# Append seconds so that "HH:MM" strings parse as timedeltas
curr = pd.to_timedelta(df['current_time'] + ':00')
s = pd.to_timedelta(df['asleep'] + ':00')
w = pd.to_timedelta(df['wokeup'] + ':00')
df['wake'] = ((curr.ge(s)==curr.ge(w))==s.le(w)).astype(int)
Output:
id current_time asleep wokeup wake
0 25 00:01 23:10 6:15 0
1 25 02:01 23:10 6:15 0
2 25 06:55 23:10 6:15 1
3 18 22:03 00:10 6:15 1
4 18 23:33 00:10 6:15 1
5 18 00:33 00:10 6:15 0
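If the one-line comparison is hard to read, the same logic can be spelled out case by case. The following is only a sketch under some assumptions: the sample frame is rebuilt by hand from the table in the question, the helper `to_td` and the intermediate names (`same_day`, `asleep_same_day`, `asleep_overnight`) are made up for illustration, and a time exactly equal to wokeup is treated as awake, as in the expression above.

```python
import pandas as pd

# Sample data rebuilt from the question's table (times as "HH:MM" strings)
df = pd.DataFrame({
    "id": [25, 25, 25, 18, 18, 18],
    "current_time": ["00:01", "02:01", "06:55", "22:03", "23:33", "00:33"],
    "asleep": ["23:10", "23:10", "23:10", "00:10", "00:10", "00:10"],
    "wokeup": ["6:15"] * 6,
})

def to_td(col):
    """Hypothetical helper: turn "HH:MM" strings into timedeltas."""
    return pd.to_timedelta(col + ":00")

curr, s, w = (to_td(df[c]) for c in ("current_time", "asleep", "wokeup"))

# Case 1: asleep and wokeup fall on the same day (e.g. 00:10 -> 6:15),
# so the person is asleep when asleep <= current_time < wokeup.
same_day = s <= w
asleep_same_day = same_day & (curr >= s) & (curr < w)

# Case 2: the sleep interval crosses midnight (e.g. 23:10 -> 6:15),
# so the person is asleep late in the evening OR early in the morning.
asleep_overnight = ~same_day & ((curr >= s) | (curr < w))

# wake is 1 whenever neither "asleep" case applies
df["wake"] = (~(asleep_same_day | asleep_overnight)).astype(int)
print(df)
```

On the sample data this should produce the same wake column as the compact expression. The split into two cases is what the original attempt was missing: a single "before asleep or after wokeup" test only works when the sleep interval does not cross midnight.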