How to stop getting NaN values when using str.split?

Time:01-17

How can I use str.split without turning some values in the column into NaN? I'm going through a column and splitting cells that contain multiple dates into multiple rows, which works fine. However, rows that contain only a single date come back as NaN, and I want those to be left unchanged. Please see below. Table before the code:

Event Date
2020-07-16
31/03/2022, 26/11/2018, 31/01/2028

Code:

df["Event Date"] = df["Event Date"].str.replace(' ', '')
df["Event Date"] = df["Event Date"].str.split(",")
df = df.explode("Event Date")
pd.to_datetime(df['Event Date'], dayfirst=True, errors='coerce')

Table after code:

Event Date
NaN
31/03/2022
26/11/2018
31/01/2028

What I'm trying to achieve:

Event Date
2020-07-16
31/03/2022
26/11/2018
31/01/2028

CodePudding user response:

Your code works for me. A more compact version gives the expected result:

>>> pd.to_datetime(df['Event Date'].str.split(', ').explode(), dayfirst=True)
0   2020-07-16
1   2022-03-31
1   2018-11-26
1   2028-01-31
Name: Event Date, dtype: datetime64[ns]

CodePudding user response:

Your code works just as you intended.

import pandas as pd
df = pd.DataFrame(data={'Event Date':['2020-07-16',"31/03/2022, 26/11/2018, 31/01/2028"]})
df

Results in

    Event Date
0   2020-07-16
1   31/03/2022, 26/11/2018, 31/01/2028

And after running your code:

df["Event Date"] = df["Event Date"].str.replace(' ', '')
df["Event Date"] = df["Event Date"].str.split(",")
df = df.explode("Event Date")
pd.to_datetime(df['Event Date'], dayfirst=True, errors='coerce')

The resulting pd.Series is

0   2020-07-16
1   2022-03-31
1   2018-11-26
1   2028-01-31
Name: Event Date, dtype: datetime64[ns]
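
If NaN still appears with the real data, one possible cause (an assumption, since the question does not show the column's dtype) is that the single-date cells are not strings at all, for example Timestamp or datetime objects coming from an Excel import. The .str accessor returns NaN for non-string elements of an object column. A minimal sketch, using a made-up frame and a hypothetical helper named split_dates, that splits only the string cells and leaves everything else untouched:

```python
import pandas as pd

# Made-up data for illustration: the first cell is a real Timestamp,
# the second is a comma-separated string of dates.
df = pd.DataFrame({
    "Event Date": [
        pd.Timestamp("2020-07-16"),
        "31/03/2022, 26/11/2018, 31/01/2028",
    ]
})

# Reproduce the symptom: .str methods yield NaN for non-string elements.
split_raw = df["Event Date"].str.split(",")
print(split_raw.iloc[0])  # NaN for the Timestamp cell


def split_dates(value):
    """Hypothetical helper: split strings on commas, wrap anything else."""
    if isinstance(value, str):
        return [part.strip() for part in value.split(",")]
    return [value]


# Split only where it makes sense, then give each date its own row.
df["Event Date"] = df["Event Date"].map(split_dates)
df = df.explode("Event Date")

# Parse element by element so each value is interpreted on its own;
# values that are already Timestamps pass through unchanged.
df["Event Date"] = df["Event Date"].map(
    lambda v: pd.to_datetime(v, dayfirst=True)
)
print(df)
```

Parsing each value separately is slower than a single vectorized pd.to_datetime call, but it avoids relying on one inferred format for a column that mixes ISO dates and day-first dates.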