Date
2021/8/1
8-5-2021
8-6-2021:08:00:00 PM
I would like all the values in this column to be in the format yyyy-mm-dd.
This is what I have tried, but it raises an "Unknown string format" error:
df['Date']= pd.to_datetime(df['Date'].str.split(':', n=1).str[0])
CodePudding user response:
You could apply a custom function to your Date column that tries each known date format in turn.
import pandas as pd
import io
from datetime import datetime
temp_data=u"""Date
8/1/2021
8-5-2021
8-6-2021:08:00:00 PM
"""
data = pd.read_csv(io.StringIO(temp_data), sep=";", parse_dates=False)
def to_date(string_date):
    # Day-first formats; the first one that matches wins.
    formats = ["%d/%m/%Y", "%d-%m-%Y", "%d-%m-%Y:%I:%M:%S %p"]
    for fmt in formats:
        try:
            # Parse the value that was passed in, using the current format.
            return datetime.strptime(string_date, fmt).date()
        except ValueError:
            pass
    raise RuntimeError(f"Unable to parse date {string_date} with available formats={formats}")
You can add entries to the formats list in to_date to handle any other date format.
data['Date'] = data['Date'].apply(to_date)
>>> data
         Date
0  2021-01-08
1  2021-05-08
2  2021-06-08
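The same try-each-format idea can also be sketched with pd.to_datetime directly, which avoids calling a Python function once per row. This is only a sketch under assumptions: the sample values are the ones used in this answer, the dates are taken to be day-first, and the names dates, parsed and result are made up for illustration.

```python
import pandas as pd

# Sample values copied from this answer; the real column may differ.
dates = pd.Series(["8/1/2021", "8-5-2021", "8-6-2021:08:00:00 PM"])

# Day-first formats, mirroring the list in to_date (an assumption about the data).
formats = ["%d/%m/%Y", "%d-%m-%Y", "%d-%m-%Y:%I:%M:%S %p"]

# errors="coerce" yields NaT for values that do not match the given format.
parsed = pd.to_datetime(dates, format=formats[0], errors="coerce")
for fmt in formats[1:]:
    # Fill only the rows that are still unparsed.
    parsed = parsed.fillna(pd.to_datetime(dates, format=fmt, errors="coerce"))

# Drop the time component and render as yyyy-mm-dd strings.
result = parsed.dt.strftime("%Y-%m-%d")
print(result.tolist())
```

Any value that matches none of the formats stays NaT instead of raising, so it may be worth checking parsed.isna().any() before relying on the result.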
CodePudding user response:
Use the format option of pd.to_datetime, passing a format string that matches the values in the column. For more information, see the to_datetime documentation.
df['Date'] = pd.to_datetime(df['Date'], format='%Y%m%d')
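Because format applies a single pattern to every value, a column with mixed layouts like the one in the question would likely need to be normalized first. A minimal sketch, assuming the sample values from the first answer, that the dates are day-first, and that the time suffix can simply be discarded (the names dates, cleaned and result are illustrative):

```python
import pandas as pd

# Sample values from the first answer; the real column may differ.
dates = pd.Series(["8/1/2021", "8-5-2021", "8-6-2021:08:00:00 PM"])

# Keep only the text before the first ':' (drops the time), then unify separators.
cleaned = dates.str.split(":", n=1).str[0].str.replace("/", "-", regex=False)

# One explicit format now fits every value; day-first is an assumption.
parsed = pd.to_datetime(cleaned, format="%d-%m-%Y")

# Render as yyyy-mm-dd strings.
result = parsed.dt.strftime("%Y-%m-%d")
print(result.tolist())
```

If the dates are actually month-first, the format string would be "%m-%d-%Y" instead.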