Home > Blockchain >  Create date/hour variable for each hour between two datetime variables
Create date/hour variable for each hour between two datetime variables

Time:11-12

I have a data table that looks like this:

ID       ARRIVAL_DATE_TIME          DISPOSITION_DATE
1           2021-11-07 08:35:00     2021-11-07 17:58:00
2           2021-11-07 13:16:00     2021-11-08 02:52:00
3           2021-11-07 15:12:00     2021-11-07 21:08:00

I want to be able to count the number of patients in our location by date/hour and hour. I imagine I would eventually have to transform this data into a format seen below and then create a pivot table, but I'm not sure how to first transform this data. So for example, ID 1 would have a row for each date/hour and hour between '2021-11-07 08:35:00' and '2021-11-07 17:58:00'.

ID            DATE_HOUR_IN_ED        HOUR_IN_ED
1           2021-11-07 08:00:00          8:00
1           2021-11-07 09:00:00          9:00
1           2021-11-07 10:00:00          10:00
1           2021-11-07 11:00:00          11:00
...
2           2021-11-07 13:00:00          13:00
2           2021-11-07 14:00:00          14:00
2           2021-11-07 15:00:00          15:00
....

CodePudding user response:

Use to_datetime with Series.dt.floor for remove times, then concat with repeat date_range and last create DataFrame by constructor:

df['ARRIVAL_DATE_TIME'] = pd.to_datetime(df['ARRIVAL_DATE_TIME']).dt.floor('H')

s = pd.concat([pd.Series(r.ID,pd.date_range(r.ARRIVAL_DATE_TIME, 
                                             r.DISPOSITION_DATE, freq='H')) 
              for r in df.itertuples()])

df1 = pd.DataFrame({'ID':s.to_numpy(), 
                    'DATE_HOUR_IN_ED':s.index, 
                    'HOUR_IN_ED': s.index.strftime('%H:%M')})

print (df1)
    ID     DATE_HOUR_IN_ED HOUR_IN_ED
0    1 2021-11-07 08:00:00      08:00
1    1 2021-11-07 09:00:00      09:00
2    1 2021-11-07 10:00:00      10:00
3    1 2021-11-07 11:00:00      11:00
4    1 2021-11-07 12:00:00      12:00
5    1 2021-11-07 13:00:00      13:00
6    1 2021-11-07 14:00:00      14:00
7    1 2021-11-07 15:00:00      15:00
8    1 2021-11-07 16:00:00      16:00
9    1 2021-11-07 17:00:00      17:00
10   2 2021-11-07 13:00:00      13:00
11   2 2021-11-07 14:00:00      14:00
12   2 2021-11-07 15:00:00      15:00
13   2 2021-11-07 16:00:00      16:00
14   2 2021-11-07 17:00:00      17:00
15   2 2021-11-07 18:00:00      18:00
16   2 2021-11-07 19:00:00      19:00
17   2 2021-11-07 20:00:00      20:00
18   2 2021-11-07 21:00:00      21:00
19   2 2021-11-07 22:00:00      22:00
20   2 2021-11-07 23:00:00      23:00
21   2 2021-11-08 00:00:00      00:00
22   2 2021-11-08 01:00:00      01:00
23   2 2021-11-08 02:00:00      02:00
24   3 2021-11-07 15:00:00      15:00
25   3 2021-11-07 16:00:00      16:00
26   3 2021-11-07 17:00:00      17:00
27   3 2021-11-07 18:00:00      18:00
28   3 2021-11-07 19:00:00      19:00
29   3 2021-11-07 20:00:00      20:00
30   3 2021-11-07 21:00:00      21:00

Alternative solution:

df['ARRIVAL_DATE_TIME'] = pd.to_datetime(df['ARRIVAL_DATE_TIME']).dt.floor('H')

L = [pd.date_range(s,e, freq='H') 
     for s, e in df[['ARRIVAL_DATE_TIME','DISPOSITION_DATE']].to_numpy()]
df['DATE_HOUR_IN_ED'] = L

df = (df.drop(['ARRIVAL_DATE_TIME','DISPOSITION_DATE'], axis=1)
        .explode('DATE_HOUR_IN_ED')
        .reset_index(drop=True)
        .assign(HOUR_IN_ED = lambda x: x['DATE_HOUR_IN_ED'].dt.strftime('%H:%M')))

CodePudding user response:

Try this:

import pandas as pd
import numpy as np
df = pd.read_excel('test.xls')
df1 = (df.set_index(['ID'])
   .assign(DATE_HOUR_IN_ED=lambda x: [pd.date_range(s,d, freq='H') 
                                     for s,d in zip(x.ARRIVAL_DATE_TIME, x.DISPOSITION_DATE)])
   ['DATE_HOUR_IN_ED'].explode()
  .reset_index()
)
df1['DATE_HOUR_IN_ED'] = df1['DATE_HOUR_IN_ED'].dt.floor('H')
df1['HOUR_IN_ED'] = df1['DATE_HOUR_IN_ED'].dt.strftime('%H:%M')

print(df1)

Output:

ID     DATE_HOUR_IN_ED HOUR_IN_ED
0    1 2021-11-07 08:00:00      08:00
1    1 2021-11-07 09:00:00      09:00
2    1 2021-11-07 10:00:00      10:00
3    1 2021-11-07 11:00:00      11:00
4    1 2021-11-07 12:00:00      12:00
5    1 2021-11-07 13:00:00      13:00
6    1 2021-11-07 14:00:00      14:00
7    1 2021-11-07 15:00:00      15:00
8    1 2021-11-07 16:00:00      16:00
9    1 2021-11-07 17:00:00      17:00
10   2 2021-11-07 13:00:00      13:00
11   2 2021-11-07 14:00:00      14:00
12   2 2021-11-07 15:00:00      15:00
13   2 2021-11-07 16:00:00      16:00
14   2 2021-11-07 17:00:00      17:00
15   2 2021-11-07 18:00:00      18:00
16   2 2021-11-07 19:00:00      19:00
17   2 2021-11-07 20:00:00      20:00
18   2 2021-11-07 21:00:00      21:00
19   2 2021-11-07 22:00:00      22:00
20   2 2021-11-07 23:00:00      23:00
21   2 2021-11-08 00:00:00      00:00
22   2 2021-11-08 01:00:00      01:00
23   2 2021-11-08 02:00:00      02:00
24   3 2021-11-07 15:00:00      15:00
25   3 2021-11-07 16:00:00      16:00
26   3 2021-11-07 17:00:00      17:00
27   3 2021-11-07 18:00:00      18:00
28   3 2021-11-07 19:00:00      19:00
29   3 2021-11-07 20:00:00      20:00
  • Related