Extracting hh:mm:ss from a column in a dataframe

Time:02-22

I am trying to extract hh:mm:ss from a column of a DataFrame:

Column1  Column 2  Column 3
1        2         12:30:11
2        3         1904-01-01 13:40:20
3        4         20:03:04

In column three there are some entries starting with 1904-01-01. I just want to get hh:mm:ss from the column.

CodePudding user response:

If the values are strings, take the last 8 characters of each one:

df['Column 3'] = df['Column 3'].str[-8:]
print (df)
   Column1  Column 2  Column 3
0        1         2  12:30:11
1        2         3  13:40:20
2        3         4  20:03:04

Or convert to datetimes and get HH:MM:SS format:

df['Column 3'] = pd.to_datetime(df['Column 3']).dt.strftime('%H:%M:%S')
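One caveat with the datetime route: recent pandas releases (2.0 and later) infer a single format from the first value, so a column that mixes plain times with full timestamps may raise a ValueError. A minimal sketch of a workaround, assuming the sample data from the question; the helper name to_hms is hypothetical, and the fallback branch is only there for older pandas versions that do not know format="mixed":

```python
import pandas as pd

# Sample frame rebuilt from the question (values assumed to be strings).
df = pd.DataFrame({
    "Column1": [1, 2, 3],
    "Column 2": [2, 3, 4],
    "Column 3": ["12:30:11", "1904-01-01 13:40:20", "20:03:04"],
})

def to_hms(values):
    """Parse mixed time / timestamp strings and format them as HH:MM:SS."""
    try:
        # pandas >= 2.0: parse each element with its own format.
        parsed = pd.to_datetime(values, format="mixed")
    except ValueError:
        # Older pandas: the default parser already works element by element.
        parsed = pd.to_datetime(values)
    return parsed.dt.strftime("%H:%M:%S")

df["Column 3"] = to_hms(df["Column 3"])
print(df)
```

Time-only strings are given a placeholder date during parsing, which does not matter here because only the time part is kept.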

CodePudding user response:

A safer way is to use a regex. Even if the column contains NaN or misaligned strings (e.g. trailing spaces), it extracts exactly the hh:mm:ss part:

# raw string avoids invalid-escape warnings; expand=False returns a Series
df['Column 3'] = df['Column 3'].str.extract(r'(\d{2}:\d{2}:\d{2})', expand=False)

output:

   Column1  Column 2  Column 3
0        1         2  12:30:11
1        2         3  13:40:20
2        3         4  20:03:04
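To see why the regex is more forgiving than slicing, here is a small sketch on made-up messy values (the Series messy and its contents are illustrative assumptions, not data from the question). It compares str.extract with taking the last 8 characters:

```python
import numpy as np
import pandas as pd

# Hypothetical messy input: trailing space, full timestamp, NaN, junk text.
messy = pd.Series(["12:30:11 ", "1904-01-01 13:40:20", np.nan, "n/a"])

# Regex: capture the first hh:mm:ss group; rows without a match become NaN.
times = messy.str.extract(r"(\d{2}:\d{2}:\d{2})", expand=False)

# Slicing: the trailing space shifts the window, so the first value is cut off.
sliced = messy.str[-8:]

print(pd.DataFrame({"regex": times, "last_8_chars": sliced}))
```

With slicing, the first row comes out as "2:30:11 " (leading digit lost, space kept), while the regex still returns "12:30:11". Rows that contain no time at all end up as NaN in the regex result, so they are easy to spot afterwards.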