Here is my original dataframe:
creationDate
0 2022-06-02T12:41:47.498Z
1 2022-06-02T12:45:13.994Z
2 2022-06-02T12:58:57.684Z
3 2022-06-02T13:24:59.379Z
4 2022-06-02T13:29:57.304Z
I want to transform it this way:
creationDate
0 2022-06-02
1 2022-06-02
2 2022-06-02
3 2022-06-02
4 2022-06-02
Here is what I do:
df["creationDate"] = pd.to_datetime(df["creationDate"]).dt.strftime("%Y-%m-%d")
Or:
df["creationDate"] = df["creationDate"].apply(lambda x: x[:10])
Both work, but I feel like there should be an easier way.
How would you do it?
CodePudding user response:
I would suggest sticking with the first option. For the second option, you can replace apply and the lambda with the vectorized string accessor:
df["creationDate"].str[:10]
I don't think there are any better solutions.
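As a quick check, here is a minimal sketch comparing the two options on a two-row sample taken from the question. The variable names sliced and formatted are only illustrative. Both results hold strings, not dates; the difference is that the first option parses the values and so fails loudly on malformed input, while slicing just keeps the first 10 characters.

```python
import pandas as pd

# Two sample rows copied from the question.
df = pd.DataFrame({
    "creationDate": [
        "2022-06-02T12:41:47.498Z",
        "2022-06-02T12:45:13.994Z",
    ]
})

# Vectorized string slice: keeps the first 10 characters ("YYYY-MM-DD").
sliced = df["creationDate"].str[:10]

# Parse, then format back to text: validates the input as real timestamps.
formatted = pd.to_datetime(df["creationDate"]).dt.strftime("%Y-%m-%d")

print(sliced.tolist())
print(sliced.tolist() == formatted.tolist())
```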
CodePudding user response:
Your first method looks right to me. However, you can also use:
>>> pd.to_datetime(df['creationDate']).dt.date
0 2022-06-02
1 2022-06-02
2 2022-06-02
3 2022-06-02
4 2022-06-02
Name: creationDate, dtype: object # <- not datetime64: the values are Python datetime.date objects
If you want to keep the column as a datetime64 dtype, use:
>>> pd.to_datetime(df['creationDate']).dt.tz_localize(None).dt.normalize()
0 2022-06-02
1 2022-06-02
2 2022-06-02
3 2022-06-02
4 2022-06-02
Name: creationDate, dtype: datetime64[ns] # <- now you have a real datetime
Because the dates are in ISO format with a trailing Z, they are parsed as timezone-aware, so remove the timezone (tz_localize(None)) and then reset the time part to midnight (normalize).
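To make the difference between the two results concrete, here is a small sketch on two of the sample values from the question. The names parsed, as_dates and as_midnight are only illustrative, and the exact datetime64 resolution printed may depend on your pandas version.

```python
import datetime as dt

import pandas as pd

s = pd.Series(
    ["2022-06-02T12:41:47.498Z", "2022-06-02T13:29:57.304Z"],
    name="creationDate",
)

# The trailing "Z" makes the parsed values timezone-aware (UTC).
parsed = pd.to_datetime(s)

# .dt.date gives Python datetime.date objects, stored with object dtype.
as_dates = parsed.dt.date

# Dropping the timezone and normalizing keeps a datetime64 column
# whose time part is reset to 00:00:00.
as_midnight = parsed.dt.tz_localize(None).dt.normalize()

print(type(as_dates.iloc[0]))
print(as_midnight.dtype)
```

The datetime64 version keeps the .dt accessor, so date arithmetic and resampling still work on it, while the object column of date values does not support them.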