I want to add a new column, "date", to my Pandas dataframe. The first column of the dataframe is the year, the second is the month, and the third is the day (the original post showed the first 5 rows as an image). Below is what I have tried:
df['Year'] = pd.to_datetime(df[['Year','Month','Day']])
But, I keep getting the error as below:
ValueError: cannot assemble the datetimes: time data '610101' does not match format '%Y%m%d' (match)
Any help would be appreciated.
CodePudding user response:
Following up on my comment, I was able to reproduce the error and solve it by adding 1900 to the year:
df = pd.DataFrame({"year": [61,99], "month": [1, 2], "day": [3, 12]})
df["year"] = df["year"] + 1900
df['full_date'] = pd.to_datetime(df[['year','month','day']])
Output:
   year  month  day  full_date
0  1961      1    3 1961-01-03
1  1999      2   12 1999-02-12
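If you would rather keep the original two-digit year column untouched, here is a minimal sketch of the same idea. It assumes every year belongs to the 1900s, which the question does not confirm, and the names `parts` and `date` are only illustrative.

```python
import pandas as pd

# Sample data shaped like the question's dataframe (two-digit years).
df = pd.DataFrame({"Year": [61, 99], "Month": [1, 2], "Day": [3, 12]})

# Build a temporary frame with the column names to_datetime expects.
# Assumption: all years are 19xx, so adding 1900 gives the full year.
parts = pd.DataFrame({
    "year": df["Year"] + 1900,
    "month": df["Month"],
    "day": df["Day"],
})

# Assemble the dates into a new column; the original columns are unchanged.
df["date"] = pd.to_datetime(parts)
print(df)
```

This leaves Year, Month and Day as they were and only adds the new column.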
There is a format parameter to the to_datetime method (see the docs), but for some reason I wasn't able to make it work:
df['full_date'] = pd.to_datetime(df[['year','month','day']], format="%y%m%d", infer_datetime_format=False)
This still throws the same error. I am using %y, which should be a 2-digit year, but the error message still says the data does not match the format '%Y%m%d'.
CodePudding user response:
Try this, which builds a "year month day" string for each row:
df.apply(lambda x:'%s %s %s' % (x['year'],x['month'], x['day']),axis=1)
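Note that the line above returns plain strings, not datetimes. A possible way to finish the job is to parse those strings with an explicit format, sketched below. Be aware that %y follows the usual strptime convention where two-digit years 00 to 68 become 20xx, so 61 would come out as 2061. The `cutoff` value is a hypothetical choice for illustration, and this sketch assumes every year value has exactly two digits.

```python
import pandas as pd

df = pd.DataFrame({"year": [61, 99], "month": [1, 2], "day": [3, 12]})

# Join the three columns into one "yy m d" string per row, as in the answer.
as_text = df.apply(lambda x: '%s %s %s' % (x['year'], x['month'], x['day']), axis=1)

# Parse the strings; here the format argument is honoured because the
# input is a Series of strings rather than a DataFrame of components.
parsed = pd.to_datetime(as_text, format="%y %m %d")

# %y maps 00-68 to 20xx. Assumption: no real date lies after this
# hypothetical cutoff, so anything later is shifted back a century.
cutoff = pd.Timestamp("2024-01-01")
df["full_date"] = parsed.where(parsed < cutoff, parsed - pd.DateOffset(years=100))
print(df)
```

Adding 1900 to the year before assembling, as in the first answer, avoids the century correction entirely when all dates are known to be in the 1900s.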