Here is a small sample of the data set:
df_wiki.Release.head(50)
Title
100 Days My Prince September 10 – October 30, 2018
A Gentleman's Dignity 26 May –12 August 2012
A Model Family 2022
Adamas July 27, 2022
Alchemy of Souls June 18, 2022 –presentJune 18, 2022 –present
Alice none
All About Eve April 26 – July 6, 2000
Name: Release, dtype: object
I have tried converting the column to strings with astype(str), chaining lstrip, strip and replace, and deleting all whitespace, but the dash won't go away.
df_wiki.Release.astype(str).str.replace(' ','').str.split('-', expand=True)[0].head(50)
df_wiki.Release.str.lstrip().str.split('-', expand=True).head(50)
It just ends up looking like this:
100 Days My Prince September10 –October30,2018
A Gentleman's Dignity 26May –12August2012
A Model Family 2022
Adamas July27,2022
Alchemy of Souls June18,2022 –presentJune18,2022–present
Alice none
All About Eve April26–July6,2000
Name: Release, dtype: object
This is what I want it to look like after using split:
df_wiki[['Start', 'End']] = df_wiki['Release'].str.split('-', expand=True)
df_wiki.drop('Release', axis=1, inplace=True)
Title START END
100 Days My Prince September 10, 2018 October 30, 2018
A Gentleman's Dignity May 26, 2012 August 12, 2012
A Model Family none 2022
Adamas none July 27, 2022
Alchemy of Souls June 18, 2022 June 20, 2022
Alice none none
All About Eve April 26, 2000 July 6, 2000
Name: Release, dtype: object
Thanks again for your help.
CodePudding user response:
In your sample the separator is an en dash (U+2013), not the plain hyphen you are splitting on:
'–' != '-'
Out[840]: True
So change the sep to the right one, i.e. the en dash copied from your data:
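# n=1 splits only at the first dash, so a row with two dashes still yields exactly two columns
df_wiki[['Start', 'End']] = df_wiki['Release'].str.split('–', n=1, expand=True)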
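If you are not sure which dash characters are in the column, a rough sketch like the one below may help. It rebuilds a few rows from the question as a Series (the name release and the row selection are only for illustration), lists the non-ASCII characters it finds, and then splits on a hyphen, en dash or em dash in one go. It assumes each title has at most one date range worth keeping and does not try to fill in missing years or move single dates into the End column.

```python
import unicodedata

import pandas as pd

# A few rows copied from the question; the index holds the titles.
release = pd.Series(
    {
        "100 Days My Prince": "September 10 – October 30, 2018",
        "A Gentleman's Dignity": "26 May –12 August 2012",
        "A Model Family": "2022",
        "Alchemy of Souls": "June 18, 2022 –presentJune 18, 2022 –present",
        "All About Eve": "April 26 – July 6, 2000",
    },
    name="Release",
)

# Step 1: list every non-ASCII character in the column with its Unicode name.
odd_chars = sorted({ch for text in release for ch in text if ord(ch) > 127})
for ch in odd_chars:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Step 2: split on a hyphen, en dash or em dash, swallowing the spaces around it.
# A multi-character pattern is treated as a regular expression by str.split.
# n=1 keeps everything after the first dash together in the second column.
parts = release.str.split(r"\s*[-\u2013\u2014]\s*", n=1, expand=True)
parts.columns = ["Start", "End"]
print(parts)
```

Rows without any dash, such as "A Model Family", end up with a missing value in End, so you would still need a follow-up step if you want single dates treated differently.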