I'm trying to change the date format of a column, but I can't apply my function because the column has NaN cells.
# change date format
def dmy_to_dmy(d):
    return datetime.strptime(d, '%d %B %Y').strftime('%d/%m/%Y')
dates2['Dates_Reins'] = dates2['Dates_Reins'].apply(dmy_to_dmy)
My data looks like this:
46 9 September 2021
47 NaN
48 24 July 2021
49 28 September 2021
50 18 October 2021
51 8 January 2021
52 NaN
Thanks
CodePudding user response:
There are two options: ignoring the problem or detecting it. The first is the easy one, since you simply try the conversion and return the value unchanged if it fails. The second detects the NaN explicitly.
First solution
# change date format
def dmy_to_dmy(d):
    try:
        return datetime.strptime(d, '%d %B %Y').strftime('%d/%m/%Y')
    except TypeError:
        return d
dates2['Dates_Reins'] = dates2['Dates_Reins'].apply(dmy_to_dmy)
Second solution
# change date format
def dmy_to_dmy(d):
    # d == np.nan is always False, so test with pd.isna() instead
    if pd.isna(d):
        return d
    else:
        return datetime.strptime(d, '%d %B %Y').strftime('%d/%m/%Y')
dates2['Dates_Reins'] = dates2['Dates_Reins'].apply(dmy_to_dmy)
Since you haven't provided the actual data, you will have to check for yourself how the missing values are stored and adjust the check accordingly.
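A quick way to see how NaN behaves in comparisons, for anyone writing a missing-value check inside such a function. This is only a minimal sketch; the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN is defined as unequal to everything, including itself,
# so an equality test can never detect it.
equal_to_nan = value == np.nan        # False
not_equal_to_nan = value != np.nan    # True

# pd.isna() accepts floats and strings alike, so it is safe on every cell.
is_missing = pd.isna(value)                       # True
string_is_missing = pd.isna('9 September 2021')   # False
```

So a check written as `d == np.nan` never fires, while `pd.isna(d)` does.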
CodePudding user response:
You can detect whether the input to your function is a NaN value and return it before you start to operate on it:
def dmy_to_dmy(d):
    # d == np.nan is always False, so use pd.isna() for the check
    if pd.isna(d):
        return np.nan
    return datetime.strptime(d, '%d %B %Y').strftime('%d/%m/%Y')
CodePudding user response:
You can either make your function ignore NaN:
def dmy_to_dmy(d):
    if isinstance(d, float) and np.isnan(d):
        return np.nan
    return datetime.strptime(d, '%d %B %Y').strftime('%d/%m/%Y')
dates2['Dates_Reins'] = dates2['Dates_Reins'].apply(dmy_to_dmy)
Or, call your function on only the non-NaN values:
notna = dates2['Dates_Reins'].notna()
dates2['Dates_Reins'] = dates2.loc[notna, 'Dates_Reins'].apply(dmy_to_dmy)
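To see why the masked version leaves the NaN rows intact, here is a small sketch on a few rows copied from the question (the DataFrame construction is an assumption, since the original data was not posted). The function only runs on the selected rows, and the assignment aligns on the index, so rows that were skipped come back as NaN.

```python
from datetime import datetime

import numpy as np
import pandas as pd

def dmy_to_dmy(d):
    return datetime.strptime(d, '%d %B %Y').strftime('%d/%m/%Y')

# Sample rows taken from the question; the index values mimic the post.
dates2 = pd.DataFrame(
    {'Dates_Reins': ['9 September 2021', np.nan, '24 July 2021']},
    index=[46, 47, 48],
)

# Boolean mask of the rows that actually hold a date string.
notna = dates2['Dates_Reins'].notna()

# Only the masked rows are converted; row 47 is absent from the result
# and is filled with NaN when the column is assigned back.
dates2['Dates_Reins'] = dates2.loc[notna, 'Dates_Reins'].apply(dmy_to_dmy)
```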
CodePudding user response:
Try:
dates2['Dates_Reins'].apply(lambda x: dmy_to_dmy(x) if pd.notna(x) else x)
Inspired by: https://stackoverflow.com/a/56589062/7509907
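A related option, if you want to keep your own function unchanged: `Series.map` has an `na_action='ignore'` argument that passes missing values through without calling the function on them. A minimal sketch, assuming a few sample rows from the question:

```python
from datetime import datetime

import numpy as np
import pandas as pd

def dmy_to_dmy(d):
    return datetime.strptime(d, '%d %B %Y').strftime('%d/%m/%Y')

# Sample rows taken from the question.
dates2 = pd.DataFrame(
    {'Dates_Reins': ['28 September 2021', '18 October 2021', np.nan]}
)

# na_action='ignore' skips NaN cells, so dmy_to_dmy only ever sees strings.
dates2['Dates_Reins'] = dates2['Dates_Reins'].map(dmy_to_dmy, na_action='ignore')
```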
CodePudding user response:
I wouldn't use apply here:
dates2['Dates_Reins'] = (
    pd.to_datetime(dates2['Dates_Reins'], format='%d %B %Y')
    .dt.strftime('%d/%m/%Y')
)
- pd.to_datetime() can replace strptime()
- The .dt accessor allows direct access to strftime()
This is faster and deals with missing values implicitly.
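For illustration, a small self-contained run on rows copied from the question (the DataFrame construction is an assumption, since the real data was not posted). The intermediate names `parsed` and `formatted` are only there to show each step.

```python
import numpy as np
import pandas as pd

# Sample rows taken from the question.
dates2 = pd.DataFrame(
    {'Dates_Reins': ['9 September 2021', np.nan, '24 July 2021', '8 January 2021']}
)

# Missing values become NaT instead of raising an error.
parsed = pd.to_datetime(dates2['Dates_Reins'], format='%d %B %Y')

# Formatting NaT yields a missing value again, so the gaps survive.
formatted = parsed.dt.strftime('%d/%m/%Y')

dates2['Dates_Reins'] = formatted
```

If the column may also contain strings that do not match the format, `pd.to_datetime` accepts `errors='coerce'` to turn those into NaT as well; whether that is desirable depends on your data.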