I have a column of my dataframe that is made up of the following:
df['Year'] = [2025, 2024, NaN, 2023, 2026, NaN]
(these are of type float64)
How can I convert these years to datetime format? Since there are no months or days included, I assume the output would have to be [01-01-2025, 01-01-2024, NaT, 01-01-2023, 01-01-2026, NaT] by default.
But if there were a way to still have the column as [2025, 2024, NaT, 2023, 2026, NaT], that would work well too.
Using df['Year'] = pd.DatetimeIndex(df['Year']).year just outputs [1970, 1970, NaN, 1970, 1970, NaN].
Thank you very much.
CodePudding user response:
You can use pandas' to_datetime() and set errors='coerce' to take care of the NaNs (-> NaT):
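Since the column is float64, cast it to a nullable integer and then to a string first. Otherwise each value is read as '2025.0', which does not match %Y and is coerced to NaT as well.
df['Year'] = pd.to_datetime(df['Year'].astype('Int64').astype(str), format='%Y', errors='coerce')
The output is going to be like 01-01-2025, 01-01-2024 ...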
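A minimal sketch covering both outcomes the question asks about, assuming a float64 column like the one described. The sample frame and the column names Date and YearInt are made up for illustration. It parses only the non-missing years and lets reindex() put the gaps back as NaT, then shows a nullable integer column as one way to keep bare years.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question's float64 column.
df = pd.DataFrame({"Year": [2025, 2024, np.nan, 2023, 2026, np.nan]})

# Parse only the non-missing years, as text such as '2025' (not '2025.0').
year_text = df["Year"].dropna().astype(int).astype(str)
parsed = pd.to_datetime(year_text, format="%Y")

# Reindexing restores the dropped rows, which become NaT.
df["Date"] = parsed.reindex(df.index)

# Option 2: keep bare years, with a nullable integer dtype for the gaps.
df["YearInt"] = df["Year"].astype("Int64")

print(df)
```

In the integer column the missing entries are shown as <NA> rather than NaT, since NaT only exists for datetime-like dtypes.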
CodePudding user response:
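Probably not the most elegant solution, but if you convert the column to strings and fill the missing values with a dummy year (say 1900), you can use parser from dateutil:
from dateutil import parser
# drop the NaNs before casting so the years become '2025' rather than '2025.0'
years = df['Year'].dropna().astype(int).astype(str)
('01/01/' + years).reindex(df.index).fillna('1900').apply(parser.parse)
Out[67]:
0   2025-01-01
1   2024-01-01
2   1900-07-21
3   2023-01-01
4   2026-01-01
5   1900-07-21
Note that the dummy rows come out as 1900-07-21 rather than 1900-01-01, because parser.parse fills a missing month and day from the current date.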
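If the dummy date should not depend on the day the code runs, dateutil's parse() accepts a default datetime that supplies any fields missing from the string. A rough sketch, assuming the same float64 column. The sample frame and the names fallback, year_text and date_text are made up for illustration.

```python
from datetime import datetime

import numpy as np
import pandas as pd
from dateutil import parser

# Hypothetical sample data mirroring the question's float64 column.
df = pd.DataFrame({"Year": [2025, 2024, np.nan, 2023, 2026, np.nan]})

# Cast only the non-missing years to text ('2025', not '2025.0').
year_text = df["Year"].dropna().astype(int).astype(str)

# Restore the missing rows, then fill them with the dummy year.
date_text = ("01/01/" + year_text).reindex(df.index).fillna("1900")

# Fields absent from the string are taken from `default`.
fallback = datetime(1900, 1, 1)
parsed = date_text.apply(lambda text: parser.parse(text, default=fallback))

print(parsed)
```

With this default the dummy rows parse to 1900-01-01 on any day, which makes them easier to filter out later.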