Hi all, I have a list of DatetimeIndex objects at 15-minute intervals (see screenshot).
I would like to create a dataframe with only 2 columns, the first with 'hour' and the second with 'minute', built from this Datetime index. I would like the result to look like this:
hour minute
08 00
08 15
08 30 etc.
Is it possible? Thanks a lot!
CodePudding user response:
import pandas as pd
#sample data: a list of one-element DatetimeIndex objects
L = [pd.Index([x]) for x in pd.date_range('2022-12-23 08:30', freq='15Min', periods=5)]
print (L)
[DatetimeIndex(['2022-12-23 08:30:00'], dtype='datetime64[ns]', freq=None),
DatetimeIndex(['2022-12-23 08:45:00'], dtype='datetime64[ns]', freq=None),
DatetimeIndex(['2022-12-23 09:00:00'], dtype='datetime64[ns]', freq=None),
DatetimeIndex(['2022-12-23 09:15:00'], dtype='datetime64[ns]', freq=None),
DatetimeIndex(['2022-12-23 09:30:00'], dtype='datetime64[ns]', freq=None)]
Use a list comprehension that flattens the list and builds (hour, minute) tuples:
df = pd.DataFrame([(y.hour, y.minute) for x in L for y in x], columns=['hour','minute'])
print (df)
hour minute
0 8 30
1 8 45
2 9 0
3 9 15
4 9 30
Or flatten into a single DatetimeIndex first and use its hour and minute attributes:
idx = pd.Index([y for x in L for y in x])
print (idx)
DatetimeIndex(['2022-12-23 08:30:00', '2022-12-23 08:45:00',
'2022-12-23 09:00:00', '2022-12-23 09:15:00',
'2022-12-23 09:30:00'],
dtype='datetime64[ns]', freq=None)
df = pd.DataFrame({'hour':idx.hour, 'minute':idx.minute})
print (df)
hour minute
0 8 30
1 8 45
2 9 0
3 9 15
4 9 30
If each index always holds exactly one element, you can select the first value by indexing:
df = pd.DataFrame([(x[0].hour, x[0].minute) for x in L], columns=['hour','minute'])
print (df)
hour minute
0 8 30
1 8 45
2 9 0
3 9 15
4 9 30
Or:
idx = pd.Index([x[0] for x in L])
print (idx)
DatetimeIndex(['2022-12-23 08:30:00', '2022-12-23 08:45:00',
'2022-12-23 09:00:00', '2022-12-23 09:15:00',
'2022-12-23 09:30:00'],
dtype='datetime64[ns]', freq=None)
df = pd.DataFrame({'hour':idx.hour, 'minute':idx.minute})
print (df)
hour minute
0 8 30
1 8 45
2 9 0
3 9 15
4 9 30
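Note that the outputs above hold integers (8, 0), while the question shows zero-padded values (08, 00). If the padded form is what is needed, one possible approach is to format the values as strings with strftime. This is only a sketch that reuses the sample data from this answer; the name df_padded is made up for illustration, and the resulting columns are strings rather than numbers.

```python
import pandas as pd

# Same sample data as above: a list of one-element DatetimeIndex objects.
L = [pd.Index([x]) for x in pd.date_range('2022-12-23 08:30', freq='15min', periods=5)]

# Flatten into a single DatetimeIndex.
idx = pd.DatetimeIndex([y for x in L for y in x])

# strftime returns strings, so the leading zeros are kept.
# df_padded is a hypothetical name used only for this example.
df_padded = pd.DataFrame({'hour': idx.strftime('%H'),
                          'minute': idx.strftime('%M')})
print(df_padded)
```

Because the columns are strings, convert them back with astype(int) before doing any arithmetic on them.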
CodePudding user response:
If the datetimes are already in a DataFrame column (here named 'timestamp'), you can create the two columns using:
df['hour'] = df['timestamp'].apply(lambda x: x.hour)
df['minute'] = df['timestamp'].apply(lambda x: x.minute)
Or without lambda, using the .dt accessor:
df['hour'] = df['timestamp'].dt.hour
df['minute'] = df['timestamp'].dt.minute
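The .dt accessor only works when the column has a datetime dtype. If the timestamps were read in as plain strings (an assumption, since the question does not say how the data was loaded), converting them with pd.to_datetime first should make the lines above work. A minimal sketch, with made-up sample values and the hypothetical name df_ts:

```python
import pandas as pd

# Hypothetical input: timestamps stored as plain strings.
df_ts = pd.DataFrame({'timestamp': ['2022-12-23 08:00',
                                    '2022-12-23 08:15',
                                    '2022-12-23 08:30']})

# Convert to datetime64 so that the .dt accessor becomes available.
df_ts['timestamp'] = pd.to_datetime(df_ts['timestamp'])

df_ts['hour'] = df_ts['timestamp'].dt.hour
df_ts['minute'] = df_ts['timestamp'].dt.minute
print(df_ts)
```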
For example:
import pandas as pd
import datetime
x = [datetime.datetime.now() + datetime.timedelta(seconds=i*900) for i in range(10)]
df = pd.DataFrame(x, columns=['timestamp'])
df['hour'] = df['timestamp'].apply(lambda x: x.hour)
df['minute'] = df['timestamp'].apply(lambda x: x.minute)
result:
timestamp hour minute
0 2023-01-11 12:08:31.045411 12 8
1 2023-01-11 12:23:31.045411 12 23
2 2023-01-11 12:38:31.045411 12 38
3 2023-01-11 12:53:31.045411 12 53
4 2023-01-11 13:08:31.045411 13 8
5 2023-01-11 13:23:31.045411 13 23
6 2023-01-11 13:38:31.045411 13 38
7 2023-01-11 13:53:31.045411 13 53
8 2023-01-11 14:08:31.045411 14 8
9 2023-01-11 14:23:31.045411 14 23