Dataframe string slicing

Time:09-29

I have a DataFrame with a column whose values look like this:

  • abc_0.1
  • aabbcc_-0.010
  • qwerty_0.555

How can I use a lambda function to transform the column values into plain numeric values?

  • 0.1
  • -0.010
  • 0.555

CodePudding user response:

Does this answer your question?

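import pandas as pd

df = pd.DataFrame({'col': [
    'abc_0.1',
    'aabbcc_-0.010',
    'qwerty_0.555',
]})
# Capture everything after the letters and the underscore, then convert to float
df['col'] = df['col'].str.extract(r'[a-zA-Z]+_(.*)', expand=False).astype(float)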
df
    col
0   0.100
1   -0.010
2   0.555
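
Since the question asks specifically about a lambda, here is a minimal sketch of that route. It assumes every value has the form prefix_number, with the number after the last underscore. The column names num and num_vec are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col': ['abc_0.1', 'aabbcc_-0.010', 'qwerty_0.555']})

# Split once from the right on '_' and convert the trailing piece to float.
# Assumes each value contains an underscore followed by a valid number.
df['num'] = df['col'].apply(lambda s: float(s.rsplit('_', 1)[1]))

# Vectorized equivalent that avoids the Python-level lambda
df['num_vec'] = df['col'].str.rsplit('_', n=1).str[1].astype(float)

print(df)
```

The lambda raises an error on a value without an underscore or with a non-numeric suffix, so real data may need cleaning first.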

CodePudding user response:

You can use str.extract with the regex (-?\d+(?:\.\d+)?)$ and optionally convert the result with to_numeric:

df['num'] = pd.to_numeric(df['col'].str.extract(r'(-?\d+(?:\.\d+)?)$', expand=False))

output:

             col    num
0        abc_0.1  0.100
1  aabbcc_-0.010 -0.010
2   qwerty_0.555  0.555

Regex:

-?           # optionally match a - sign
\d+          # match one or more digits
(?:\.\d+)?   # optionally match a dot and digit(s)
$            # match end of string
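
One way to check this breakdown is to write the pattern with re.VERBOSE, which allows the comments to sit inside the regex itself. This is a sketch using only the standard library. The name NUMBER_AT_END is made up for illustration, and the sample strings are the ones from the question.

```python
import re

# The pattern (-?\d+(?:\.\d+)?)$ spelled out piece by piece
NUMBER_AT_END = re.compile(r"""
    (            # start of the capture group
    -?           # optionally match a - sign
    \d+          # match one or more digits
    (?:\.\d+)?   # optionally match a dot and digit(s)
    )            # end of the capture group
    $            # match end of string
""", re.VERBOSE)

samples = ['abc_0.1', 'aabbcc_-0.010', 'qwerty_0.555']
extracted = [NUMBER_AT_END.search(s).group(1) for s in samples]
print(extracted)  # ['0.1', '-0.010', '0.555']
```

The extracted values are still strings at this point; the to_numeric call is what turns them into floats.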

CodePudding user response:

# Extract a group made up of digits, periods, or minus signs, occurring one or more times

df['text'].str.extract(r'([\d\.\-]+)')
0
0   0.1
1   -0.010
2   0.555
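
Note that str.extract returns strings, so a conversion step is still needed to get numeric values. The following is a minimal sketch that assumes the column is named text, as in the snippet above, and holds the sample values from the question. The names extracted and num are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'text': ['abc_0.1', 'aabbcc_-0.010', 'qwerty_0.555']})

# expand=False returns a Series of strings instead of a one-column DataFrame
extracted = df['text'].str.extract(r'([\d.\-]+)', expand=False)

# errors='coerce' turns anything that is not a valid number into NaN
df['num'] = pd.to_numeric(extracted, errors='coerce')

print(df)
```

Because this pattern is not anchored to the end of the string, it captures the first run of digits, dots, or minus signs it finds. A prefix that itself contains a digit or hyphen would therefore be matched instead of the trailing number.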