Why is it not splitting, or even able to get rid of the dash?

Time:06-21

Here is the Data Set cut into smaller bits:

df_wiki.Release.head(50)
   
    Title
100 Days My Prince                         September 10 – October 30, 2018
A Gentleman's Dignity                             26  May –12  August 2012
A Model Family                                                        2022
Adamas                                                       July 27, 2022
Alchemy of Souls              June 18, 2022 –presentJune 18, 2022 –present
Alice                                                           none
All About Eve                                      April 26 – July 6, 2000
Name: Release, dtype: object

I have tried converting it to strings with astype, chaining lstrip, strip and replace, and deleting all white space, but the dash won't vanish.

df_wiki.Release.astype(str).str.replace(' ','').str.split('-', expand=True)[0].head(50)

df_wiki.Release.str.lstrip().str.split('-', expand=True).head(50)

It just ends up looking like this:

100 Days My Prince                         September10 –October30,2018
A Gentleman's Dignity                             26May –12August2012
A Model Family                                                      2022
Adamas                                                       July27,2022
Alchemy of Souls              June18,2022 –presentJune18,2022–present
Alice                                                           none
All About Eve                                      April26–July6,2000
Name: Release, dtype: object

This is what I wanted it to look like after using the split command:

df_wiki[['Start', 'End']] = df_wiki['Release'].str.split('-', expand=True)
df_wiki.drop('Release', axis=1, inplace=True)


     Title                                      START                END
100 Days My Prince                   September 10, 2018        October 30, 2018
A Gentleman's Dignity                      May 26, 2012        August  12, 2012
A Model Family                                 none                        2022
Adamas                                         none               July 27, 2022
Alchemy of Souls                           June 18, 2022          June 20, 2022
Alice                                           none                     none
All About Eve                             April 26, 2000           July 6, 2000
Name: Release, dtype: object

Thanks again for your help.

CodePudding user response:

In your sample, the separator is an en dash (–), not the hyphen (-) you are splitting on:

'–' != '-'  
Out[840]: True  
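
If it is not obvious which dash is in the data, one way to check is to print the code point and Unicode name of each punctuation character. This is only a sketch: `describe_punctuation` is a hypothetical helper, and `sample` is a stand-in for one value copied from the Release column.

```python
import unicodedata


def describe_punctuation(text):
    """Return (char, code point, Unicode name) for every character
    that is neither alphanumeric nor whitespace."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in sorted(set(text))
        if not ch.isalnum() and not ch.isspace()
    ]


# Stand-in for one value from df_wiki.Release (assumed, not read from the real frame)
sample = "September 10 \u2013 October 30, 2018"

for row in describe_punctuation(sample):
    print(row)
```

For this sample the dash is reported as U+2013 EN DASH, which is why splitting on the keyboard hyphen (U+002D) finds nothing to split on.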

So change the separator to the right one, the en dash:

df_wiki[['Start', 'End']] = df_wiki['Release'].str.split('–', expand=True)
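
If the column might mix several dash characters, a slightly more tolerant variant is to split on a small character class and swallow the surrounding spaces at the same time. The following is a minimal sketch, assuming a small hand-made Series in place of the real `df_wiki['Release']`; the names `DASHES`, `release` and `parts` are illustrative only.

```python
import pandas as pd

# Hand-made sample mirroring a few rows from the question (assumed data)
release = pd.Series(
    [
        "September 10 \u2013 October 30, 2018",
        "26  May \u201312  August 2012",
        "2022",
        "July 27, 2022",
        "April 26 \u2013 July 6, 2000",
    ],
    name="Release",
)

# Hyphen-minus, en dash and em dash; the hyphen comes first so it is literal in the class
DASHES = "-\u2013\u2014"

# A multi-character pattern is treated as a regular expression by str.split.
# n=1 splits at the first dash only, so each row yields at most two parts.
parts = release.str.split(rf"\s*[{DASHES}]\s*", n=1, expand=True)
parts.columns = ["Start", "End"]

print(parts)
```

Rows without any dash, such as "2022", keep their value in Start and get a missing value in End, so moving single dates into End (as in the desired output) and copying the year onto Start would still need a separate step.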