I am trying to extract hh:mm:ss from a column
Column1 Column 2 Column 3
1 2 12:30:11
2 3 1904-01-01 13:40:20
3 4 20:03:04
In column three there are some entries starting with 1904-01-01. I just want to get hh:mm:ss from the column.
CodePudding user response:
If the values are strings, keep only the last 8 characters:
df['Column 3'] = df['Column 3'].str[-8:]
print (df)
Column1 Column 2 Column 3
1 2 12:30:11
2 3 13:40:20
3 4 20:03:04
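To try the slicing approach end to end, here is a minimal sketch that rebuilds the sample frame from the question. It assumes every value in Column 3 is a string. The trailing space on the last value and the `.str.strip()` call are additions for illustration, not part of the answer above; without the strip, a value with trailing whitespace would be cut in the wrong place.

```python
import pandas as pd

# Sample data reconstructed from the question; values in "Column 3" are assumed to be strings.
# The trailing space on the last value is added here for illustration only.
df = pd.DataFrame({
    "Column1": [1, 2, 3],
    "Column 2": [2, 3, 4],
    "Column 3": ["12:30:11", "1904-01-01 13:40:20", "20:03:04 "],
})

# Strip surrounding whitespace first (an added safeguard), then keep the last 8 characters.
df["Column 3"] = df["Column 3"].str.strip().str[-8:]
print(df)
```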
Or convert to datetimes and format them as HH:MM:SS:
df['Column 3'] = pd.to_datetime(df['Column 3']).dt.strftime('%H:%M:%S')
CodePudding user response:
A safer way is to use a regex: even if the column contains NaN or misaligned strings (e.g. with trailing spaces), you still get exactly the hh:mm:ss part:
df['Column 3'] = df['Column 3'].str.extract(r'(\d{2}:\d{2}:\d{2})', expand=False)
output:
Column1 Column 2 Column 3
0 1 2 12:30:11
1 2 3 13:40:20
2 3 4 20:03:04
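As a rough check of the edge cases mentioned in this answer, the sketch below runs the same pattern over a small Series. The sample values (a trailing space, a NaN, and the non-matching string "n/a") are made up for illustration and are not from the question. Rows where the pattern does not match come back as missing values instead of raising an error.

```python
import pandas as pd

# Hypothetical sample values: a plain time, a dated value with a trailing space,
# a missing value, and a string that contains no time at all.
s = pd.Series(["12:30:11", "1904-01-01 13:40:20 ", float("nan"), "n/a"])

# expand=False returns a Series (instead of a one-column DataFrame) for a single capture group.
times = s.str.extract(r"(\d{2}:\d{2}:\d{2})", expand=False)
print(times)
```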