delete specific elements from a column

Time:10-01

I have a dataframe like this:

df1 = pd.DataFrame({'Parent': ['Stay home', "Stay home","Stay home", 'Go outside'],
                    'Child' : ['Severe weather', "Severe weather", "Severe weather", 'Sunny'],
                    'Score': ['(Score: 0.0310)', '(Score: 0.0310)', '(Score: 0.0310)', '(Score: 0.0310)']})


    Parent      Child           Score
0   Stay home   Severe weather  (Score: 0.0310)
1   Stay home   Severe weather  (Score: 0.0310)
2   Stay home   Severe weather  (Score: 0.0310)
3   Go outside  Sunny           (Score: 0.0310)

I want to remove the parentheses and the "Score: " prefix from the Score column:

    Parent      Child           Score
0   Stay home   Severe weather  0.0310
1   Stay home   Severe weather  0.0310
2   Stay home   Severe weather  0.0310
3   Go outside  Sunny           0.0310

Any ideas?

CodePudding user response:

It's probably better to extract the number:

df1['Score'] = df1['Score'].str.extract(r'(\d+(?:\.\d+)?)')

Output:

       Parent           Child   Score
0   Stay home  Severe weather  0.0310
1   Stay home  Severe weather  0.0310
2   Stay home  Severe weather  0.0310
3  Go outside           Sunny  0.0310
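
If the column is going to be used for arithmetic, the extracted strings can also be converted to numbers. This is a minimal sketch, assuming every cell contains exactly one decimal number; the `pd.to_numeric` step is an addition here and not part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Parent': ['Stay home', 'Stay home', 'Stay home', 'Go outside'],
    'Child': ['Severe weather', 'Severe weather', 'Severe weather', 'Sunny'],
    'Score': ['(Score: 0.0310)'] * 4,
})

# expand=False returns a Series instead of a one-column DataFrame
extracted = df1['Score'].str.extract(r'(\d+(?:\.\d+)?)', expand=False)

# Convert the extracted strings to floats; errors='coerce' turns
# anything that cannot be parsed (including non-matches) into NaN
df1['Score'] = pd.to_numeric(extracted, errors='coerce')
```

Once converted, the column holds floats, so the trailing zero of 0.0310 is no longer part of the value. Keep the strings if that exact formatting matters.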

CodePudding user response:

You can use re.findall:

import re

# Capture everything between "Score: " and the closing parenthesis
df1['Score'] = df1['Score'].apply(lambda x: re.findall(r'Score: (.*?)\)', x)[0])

>>> df1

       Parent           Child   Score
0   Stay home  Severe weather  0.0310
1   Stay home  Severe weather  0.0310
2   Stay home  Severe weather  0.0310
3  Go outside           Sunny  0.0310

CodePudding user response:

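Another possible solution, based on pandas.Series.str.strip. Note that strip takes a set of characters to remove from both ends of each string, not a regular expression, so no escaping or | is needed:

df1['Score'] = df1['Score'].str.strip('(Score: )')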

Output:

       Parent           Child   Score
0   Stay home  Severe weather  0.0310
1   Stay home  Severe weather  0.0310
2   Stay home  Severe weather  0.0310
3  Go outside           Sunny  0.0310
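
Because `str.strip` works on a set of characters rather than a literal prefix, it can remove more than intended when the value itself begins or ends with one of those characters. The sketch below compares it with a regex-based `str.replace`, offered as one possible alternative. The value `'(Score: error)'` is a hypothetical input chosen for illustration and does not come from the question.

```python
import pandas as pd

# The second value is hypothetical, used only to show the difference
scores = pd.Series(['(Score: 0.0310)', '(Score: error)'])

# strip removes any of the characters ( S c o r e : space ) from both ends,
# so 'error' is consumed entirely because all its letters are in the set
stripped = scores.str.strip('(Score: )')

# A regex replacement removes only the literal prefix "(Score: "
# at the start and the closing parenthesis at the end
replaced = scores.str.replace(r'^\(Score: |\)$', '', regex=True)

print(stripped.tolist())  # ['0.0310', '']
print(replaced.tolist())  # ['0.0310', 'error']
```

For the data in the question both approaches give the same result, since every value starts and ends with a digit.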