Pandas divide time in different intervals

Time:10-19

I have a dataframe in python pandas with date and time. I would like to assign an integer according to the predefined interval, for instance:

 Name   Date           Time
 F       01/01/22      23:50:00
 F1      01/01/22      22:00:00
 F2      01/01/22      20:00:00
 F3      01/01/22      19:00:00
 F4      01/01/22      18:00:00
 F5      01/01/22      17:00:00
 F6      01/01/22      16:00:00

I would like to obtain:

 Name   Date           Time       Interval
 F       01/01/22      23:50:00       1
 F1      01/01/22      22:00:00       1
 F2      01/01/22      20:00:00       2
 F3      01/01/22      19:00:00       2
 F4      01/01/22      18:00:00       3
 F5      01/01/22      17:00:00       3 
 F6      01/01/22      16:00:00       4

The interval should increase by one every 2 consecutive hours. The logic would be to assign an integer by time range: [from 00:00:00 to 02:00:00]=1, [from 02:00:01 to 04:00:00]=2, [from 04:00:01 to 06:00:00]=3, [from 06:00:01 to 08:00:00]=4, etc.

Is it possible in pandas?

Thanks

CodePudding user response:

Use cut after converting the Time column to hours:

import pandas as pd
h = pd.to_datetime(df['Time']).dt.hour
# bin edges 0, 2, ..., 24 so that hour 23 also falls into a bin
df['Interval'] = pd.cut(h, bins=range(0, 25, 2), include_lowest=True, labels=False) + 1
print(df)
   Name      Date      Time  Interval
0     F  01/01/22  00:50:00         1
1    F1  01/01/22  01:00:00         1
2    F2  01/01/22  02:00:00         1
3    F3  01/01/22  03:00:00         2
4    F4  01/01/22  04:00:00         2
5    F5  01/01/22  05:00:00         3
6    F6  01/01/22  06:00:00         3
7    F3  01/01/22  07:00:00         4
8    F4  01/01/22  08:00:00         4
9    F5  01/01/22  09:00:00         5
10   F6  01/01/22  10:00:00         5
11   F3  01/01/22  11:00:00         6
12   F4  01/01/22  12:00:00         6
13   F5  01/01/22  17:00:00         9
14   F6  01/01/22  16:00:00         8
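
To see how the bins map every hour of the day, here is a small self-contained sketch. The input Series of hours 0 through 23 and the names `edges`, `interval` and `short` are made up for illustration; the point is only that the edges have to reach 24 for hour 23 to land in a bin.

```python
import pandas as pd

# Hypothetical input: one value per hour of the day, 0 through 23.
h = pd.Series(range(24))

# Edges 0, 2, ..., 24 give twelve bins; include_lowest puts hour 0 in the first one.
edges = list(range(0, 25, 2))
interval = pd.cut(h, bins=edges, include_lowest=True, labels=False) + 1
print(interval.tolist())

# If the edges stop at 22, hour 23 is outside every bin and comes back as NaN.
short = pd.cut(pd.Series([23]), bins=list(range(0, 24, 2)),
               include_lowest=True, labels=False)
print(short.isna().all())
```

Hours 0, 1 and 2 share interval 1 because the bins are closed on the right, which matches the [00:00:00 to 02:00:00] range in the question only for times that fall exactly on the hour.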

Or:

h = pd.to_datetime(df['Time']).dt.hour
df['Interval'] = h.sub(1).clip(lower=0) // 2 + 1
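
Both variants look only at the hour, so a time such as 02:30:00 is treated like 02:00:00. If the boundaries from the question ([02:00:01 to 04:00:00]=2 and so on) need to hold to the second, one possible sketch is to work with seconds since midnight instead. The helper name `two_hour_interval` and the sample values are hypothetical, and the sketch assumes the Time column holds 'HH:MM:SS' strings.

```python
import pandas as pd

def two_hour_interval(times: pd.Series) -> pd.Series:
    """Map 'HH:MM:SS' strings to 1-based two-hour intervals (hypothetical helper)."""
    # Seconds elapsed since midnight, as integers.
    secs = pd.to_timedelta(times).dt.total_seconds().astype(int)
    # Integer ceiling division by 7200 s; 00:00:00 would give 0, so clip it to 1.
    return ((secs + 7199) // 7200).clip(lower=1)

sample = pd.Series(["00:00:00", "02:00:00", "02:00:01",
                    "02:30:00", "04:00:00", "23:50:00"])
result = two_hour_interval(sample)
print(result.tolist())  # [1, 1, 2, 2, 2, 12]
```

Usage on the original frame would be df['Interval'] = two_hour_interval(df['Time']).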

CodePudding user response:

IIUC, you can use a double groupby:

# group by consecutive hours
g1 = pd.to_datetime(df['Time']).dt.hour.diff().abs().ne(1).cumsum()
# split the consecutive in groups of 2
g2 = g1.groupby(g1).cumcount().floordiv(2)

df['Interval'] = df.groupby([g1, g2]).ngroup().add(1)

output:

  Name      Date      Time  Interval
0    F  01/01/22  23:50:00         1
1   F1  01/01/22  22:00:00         1
2   F2  01/01/22  20:00:00         2
3   F3  01/01/22  19:00:00         2
4   F4  01/01/22  18:00:00         3
5   F5  01/01/22  17:00:00         3
6   F6  01/01/22  16:00:00         4
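
For reference, the same steps as a self-contained sketch that rebuilds the sample frame from the question and prints the two intermediate grouping keys. The explicit `format` argument and the variable name `hours` are additions for illustration, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["F", "F1", "F2", "F3", "F4", "F5", "F6"],
    "Date": ["01/01/22"] * 7,
    "Time": ["23:50:00", "22:00:00", "20:00:00", "19:00:00",
             "18:00:00", "17:00:00", "16:00:00"],
})

hours = pd.to_datetime(df["Time"], format="%H:%M:%S").dt.hour

# A new run starts wherever the hour differs from the previous row by anything but 1.
g1 = hours.diff().abs().ne(1).cumsum()
# Within each run, pair up the rows: positions 0-1 -> 0, 2-3 -> 1, and so on.
g2 = g1.groupby(g1).cumcount().floordiv(2)

df["Interval"] = df.groupby([g1, g2]).ngroup().add(1)

print(g1.tolist())              # [1, 1, 2, 2, 2, 2, 2]
print(g2.tolist())              # [0, 0, 0, 0, 1, 1, 2]
print(df["Interval"].tolist())  # [1, 1, 2, 2, 3, 3, 4]
```

Note that this numbers the groups by row order and hour adjacency rather than by clock time, so it reproduces the expected output in the question but not the fixed 00:00-02:00 style ranges.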