How can I use str.split without single-value rows in the column returning NaN?
I'm going through a column and splitting cells that hold multiple dates into multiple rows, which works fine. However, rows that hold only a single date come back as NaN, and I want those left unchanged. Please see below. Table before the code:
Event Date |
---|
2020-07-16 |
31/03/2022, 26/11/2018, 31/01/2028 |
df["Event Date"] = df["Event Date"].str.replace(' ', '')
df["Event Date"] = df["Event Date"].str.split(",")
df = df.explode("Event Date")
pd.to_datetime(df['Event Date'], dayfirst=True, errors='coerce')
Table after code:
Event Date |
---|
NaN |
31/03/2022 |
26/11/2018 |
31/01/2028 |
What I'm trying to achieve:
Event Date |
---|
2020-07-16 |
31/03/2022 |
26/11/2018 |
31/01/2028 |
CodePudding user response:
Your code works as written. A shorter equivalent gives the expected result:
>>> pd.to_datetime(df['Event Date'].str.split(', ').explode(), dayfirst=True)
0   2020-07-16
1   2022-03-31
1   2018-11-26
1   2028-01-31
Name: Event Date, dtype: datetime64[ns]
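Since the code works on string-only data, one possible cause of the NaN (an assumption, not confirmed by the question) is that the single-date cell is not a string, for example a `Timestamp` read from Excel. On an object column, the `.str` methods return NaN for non-string elements. A minimal sketch for checking this, using a made-up `mixed` Series rather than the asker's data:

```python
import pandas as pd

# Hypothetical sample: the first cell is a Timestamp, the second is a string.
mixed = pd.Series(
    [pd.Timestamp("2020-07-16"), "31/03/2022, 26/11/2018"],
    name="Event Date",
)

# Flag which cells are real strings; .str methods only act on these.
is_string = mixed.map(lambda v: isinstance(v, str))

# The non-string cell becomes NaN, the string cell becomes a list.
split = mixed.str.split(",")

print(is_string)
print(split)
```

Running the same `isinstance` check on the real column would show whether the rows that turn into NaN are the non-string ones.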
CodePudding user response:
Your code works just as you intended.
import pandas as pd
df = pd.DataFrame(data={"Event Date": ["2020-07-16", "31/03/2022, 26/11/2018, 31/01/2028"]})
df
This results in:
                           Event Date
0                          2020-07-16
1  31/03/2022, 26/11/2018, 31/01/2028
And after running
df["Event Date"] = df["Event Date"].str.replace(' ', '')
df["Event Date"] = df["Event Date"].str.split(",")
df = df.explode("Event Date")
pd.to_datetime(df['Event Date'], dayfirst=True, errors='coerce')
the resulting pd.Series is:
0   2020-07-16
1   2022-03-31
1   2018-11-26
1   2028-01-31
Name: Event Date, dtype: datetime64[ns]
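If the real column does turn out to mix strings with non-string values, one way to leave the single-date rows untouched is to split only the string cells. This is a minimal sketch under that assumption: `split_dates` is a hypothetical helper, and the sample frame with a `pd.Timestamp` in the first row is invented for illustration, not taken from the asker's data.

```python
import pandas as pd

# Assumed sample: the single-date row is already a Timestamp, not a string.
df = pd.DataFrame(
    {"Event Date": [pd.Timestamp("2020-07-16"), "31/03/2022, 26/11/2018, 31/01/2028"]}
)


def split_dates(value):
    """Split comma-separated strings; wrap anything else unchanged in a list."""
    if isinstance(value, str):
        return [part.strip() for part in value.split(",")]
    return [value]


# Every cell becomes a list, so explode() never sees a NaN placeholder.
exploded = df["Event Date"].apply(split_dates).explode()

# Parse each value on its own; existing Timestamps pass through unchanged.
parsed = exploded.apply(lambda v: pd.to_datetime(v, dayfirst=True))

print(parsed)
```

Parsing each value separately also avoids relying on one inferred format for the whole column, which helps here because the column mixes `YYYY-MM-DD` and `DD/MM/YYYY` values.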