Create a dataframe with two columns (hour and minute) using a datetime index

Hi all, I have a list of DatetimeIndex objects spaced at 15-minute intervals (see screenshot).

I would like to create a dataframe with only 2 columns, the first holding the 'hour' and the second the 'minute' taken from these DatetimeIndex values. I would like the result to look like this:

hour   minute
08     00
08     15
08     30
etc.

Is it possible? Thanks a lot!

CodePudding user response：

import pandas as pd
# sample data: a list of one-element DatetimeIndex objects, 15 minutes apart
L = [pd.Index([x]) for x in pd.date_range('2022-12-23 08:30', freq='15Min', periods=5)]
print (L)
[DatetimeIndex(['2022-12-23 08:30:00'], dtype='datetime64[ns]', freq=None),
 DatetimeIndex(['2022-12-23 08:45:00'], dtype='datetime64[ns]', freq=None),
 DatetimeIndex(['2022-12-23 09:00:00'], dtype='datetime64[ns]', freq=None), 
 DatetimeIndex(['2022-12-23 09:15:00'], dtype='datetime64[ns]', freq=None), 
 DatetimeIndex(['2022-12-23 09:30:00'], dtype='datetime64[ns]', freq=None)]

Use a nested list comprehension that flattens the list and builds (hour, minute) tuples:

df = pd.DataFrame([(y.hour, y.minute) for x in L for y in x], columns=['hour','minute'])
print (df)
   hour  minute
0     8      30
1     8      45
2     9       0
3     9      15
4     9      30

Or:

idx = pd.Index([y for x in L for y in x])
print (idx)
DatetimeIndex(['2022-12-23 08:30:00', '2022-12-23 08:45:00',
               '2022-12-23 09:00:00', '2022-12-23 09:15:00',
               '2022-12-23 09:30:00'],
              dtype='datetime64[ns]', freq=None)

df = pd.DataFrame({'hour':idx.hour, 'minute':idx.minute})
print (df)
   hour  minute
0     8      30
1     8      45
2     9       0
3     9      15
4     9      30
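
As a possible alternative to the list comprehension, the indexes can also be joined with Index.append, which accepts a list of indexes. This is only a sketch under the assumption that the list is non-empty and every element is a DatetimeIndex; the names L and idx follow the answer above.

```python
import pandas as pd

# Sample data shaped like the question: a list of one-element DatetimeIndex objects
L = [pd.Index([x]) for x in pd.date_range('2022-12-23 08:30', freq='15min', periods=5)]

# Join the first index with all the remaining ones (assumes L is not empty)
idx = L[0].append(L[1:])

# .hour and .minute are available directly on a DatetimeIndex
df = pd.DataFrame({'hour': idx.hour, 'minute': idx.minute})
print(df)
```

This avoids iterating over individual timestamps in Python, which may matter if the list is long.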

If each index always contains exactly one element, you can select the first value by indexing:

df = pd.DataFrame([(x[0].hour, x[0].minute) for x in L], columns=['hour','minute'])
print (df)
   hour  minute
0     8      30
1     8      45
2     9       0
3     9      15
4     9      30

Or:

idx = pd.Index([x[0] for x in L])
print (idx)
DatetimeIndex(['2022-12-23 08:30:00', '2022-12-23 08:45:00',
               '2022-12-23 09:00:00', '2022-12-23 09:15:00',
               '2022-12-23 09:30:00'],
              dtype='datetime64[ns]', freq=None)

df = pd.DataFrame({'hour':idx.hour, 'minute':idx.minute})
print (df)
   hour  minute
0     8      30
1     8      45
2     9       0
3     9      15
4     9      30
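
The outputs above show integers (8, 0), while the question shows zero-padded values (08, 00). If the padding matters, one option is to format the values as strings with strftime. This is a minimal sketch, assuming string columns are acceptable; df_str is a name made up for this example.

```python
import pandas as pd

# Sample data shaped like the question: a list of one-element DatetimeIndex objects
L = [pd.Index([x]) for x in pd.date_range('2022-12-23 08:30', freq='15min', periods=5)]

# Flatten into a single DatetimeIndex
idx = pd.DatetimeIndex([y for x in L for y in x])

# strftime returns strings, so leading zeros are preserved
df_str = pd.DataFrame({'hour': idx.strftime('%H'), 'minute': idx.strftime('%M')})
print(df_str)
```

Note that the columns now hold strings rather than integers, so numeric operations on them would need a conversion back.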

CodePudding user response：

If the datetimes are already in a dataframe column (called 'timestamp' here), you can create the two columns using:

df['hour'] = df['timestamp'].apply(lambda x: x.hour)
df['minute'] = df['timestamp'].apply(lambda x: x.minute)

Or without a lambda, using the .dt accessor:

df['hour'] = df['timestamp'].dt.hour
df['minute'] = df['timestamp'].dt.minute

For example:

import pandas as pd
import datetime
x = [datetime.datetime.now() + datetime.timedelta(seconds=i*900) for i in range(10)]
df = pd.DataFrame(x, columns=['timestamp'])
df['hour'] = df['timestamp'].apply(lambda x: x.hour)
df['minute'] = df['timestamp'].apply(lambda x: x.minute)

Result:

                   timestamp  hour  minute
0 2023-01-11 12:08:31.045411    12       8
1 2023-01-11 12:23:31.045411    12      23
2 2023-01-11 12:38:31.045411    12      38
3 2023-01-11 12:53:31.045411    12      53
4 2023-01-11 13:08:31.045411    13       8
5 2023-01-11 13:23:31.045411    13      23
6 2023-01-11 13:38:31.045411    13      38
7 2023-01-11 13:53:31.045411    13      53
8 2023-01-11 14:08:31.045411    14       8
9 2023-01-11 14:23:31.045411    14      23
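
Because datetime.now() gives different values on every run, here is a reproducible variant of the same idea, using the .dt accessor and keeping only the two requested columns. The fixed start time and the name result are assumptions made for illustration.

```python
import pandas as pd

# Fixed start time (an assumption for illustration) so the output is reproducible
df = pd.DataFrame({'timestamp': pd.date_range('2022-12-23 08:00', freq='15min', periods=4)})

# The .dt accessor works on the whole column at once, no lambda needed
df['hour'] = df['timestamp'].dt.hour
df['minute'] = df['timestamp'].dt.minute

# Keep only the two columns the question asked for
result = df[['hour', 'minute']]
print(result)
```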