Converting columns with hours to datetime type pandas

Time:12-02

I am trying to convert the "Time" column of my pandas DataFrame, which has the form HH:MM:SS, from object to datetime64 because I want to filter by hour.

I tried new['Time'] = pd.to_datetime(new['Time'], format='%H:%M:%S').dt.time, which has no effect at all (the column is still of type object). I also tried new['Time'] = pd.to_datetime(new['Time'], infer_datetime_format=True)

which raises the error: TypeError: <class 'datetime.time'> is not convertible to datetime

I want to be able to sort my data frame by hour.

  1. How do I convert the object column to hours?
  2. Can I then filter by hour (for example, everything after 8am), or do I have to enter the exact value with minutes and seconds to filter on it?

Thank you

CodePudding user response:

IIUC, your column already contains time objects from the datetime module, which is why pd.to_datetime raises the TypeError.

Suppose you have this dataframe:

from datetime import time
import pandas as pd

df = pd.DataFrame({'Time': [time(10, 39, 23), time(8, 47, 59), time(9, 21, 12)]})
print(df)

# Output:
       Time
0  10:39:23
1  08:47:59
2  09:21:12

A few operations:

# Check that the column really holds `time` instances
>>> df['Time'].iloc[0]
datetime.time(10, 39, 23)

# Sort values by time
>>> df.sort_values('Time')
       Time
1  08:47:59
2  09:21:12
0  10:39:23

# Extract rows between 08:00 and 09:00 (both bounds inclusive)
>>> df[df['Time'].between(time(8), time(9))]
       Time
1  08:47:59

CodePudding user response:

If you want your df['Time'] to be of type datetime64 (assuming the column holds strings such as '08:47:59'), just use

df['Time'] = pd.to_datetime(df['Time'], format='%H:%M:%S')
print(df['Time'])

This will result in the following column (the date defaults to 1900-01-01 because only a time was given):

0      1900-01-01 00:00:00
1      1900-01-01 00:01:00
2      1900-01-01 00:02:00
3      1900-01-01 00:03:00
4      1900-01-01 00:04:00
               ...        
1435   1900-01-01 23:55:00
1436   1900-01-01 23:56:00
1437   1900-01-01 23:57:00
1438   1900-01-01 23:58:00
1439   1900-01-01 23:59:00
Name: Time, Length: 1440, dtype: datetime64[ns]

If you just want to extract the hour from the timestamp, extend pd.to_datetime(...) with .dt.hour
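
As a rough sketch of the .dt.hour approach: the example below assumes the column holds datetime.time objects (as the asker's error message suggests), so it converts them to strings first, since pd.to_datetime does not accept time objects directly. The column name Hour and the 9 o'clock threshold are only illustrative.

```python
from datetime import time

import pandas as pd

df = pd.DataFrame({"Time": [time(10, 39, 23), time(8, 47, 59), time(9, 21, 12)]})

# str(time(10, 39, 23)) is '10:39:23', which matches the %H:%M:%S format.
# Note: times with microseconds would need a different format string.
parsed = pd.to_datetime(df["Time"].astype(str), format="%H:%M:%S")

# .dt.hour yields the hour as an integer, which is easy to filter on.
df["Hour"] = parsed.dt.hour
from_nine = df[df["Hour"] >= 9]
```

If the column already contains plain strings, the astype(str) step is harmless and can be dropped.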


If you want to group your values on an hourly basis, you can also use the following (after converting df['Time'] to datetime, and assuming your data has a 'Value' column):

new_df = df.groupby(pd.Grouper(key='Time', freq='H'))['Value'].agg(list)

This will return all values grouped by hour.
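
For a self-contained illustration of hourly grouping, the sketch below uses made-up data and a hypothetical Value column. It groups on df['Time'].dt.hour rather than pd.Grouper, which is a different technique: it only returns hours that actually occur in the data, whereas a Grouper with an hourly frequency also produces empty bins for the hours in between.

```python
import pandas as pd

# Illustrative data; the 'Value' column is an assumption, not from the question.
df = pd.DataFrame({
    "Time": ["08:15:00", "08:45:30", "09:05:10", "10:59:59"],
    "Value": [1, 2, 3, 4],
})
df["Time"] = pd.to_datetime(df["Time"], format="%H:%M:%S")

# Group by the integer hour and collect each hour's values into a list.
by_hour = df.groupby(df["Time"].dt.hour)["Value"].agg(list)
```

Here by_hour is a Series indexed by hour (8, 9, 10), with one list of values per hour.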
