I have several columns containing strings, and I want to extract the numbers from them. I have this example:
I've tried this code to extract the number for both columns:
df1 = df['Diatom col', 'Cistos col'].str.extract('(\d )')
But it's not working.
And that's the output I need:
CodePudding user response:
Here is one way to do it.
If you post the data as text or code, I'll be able to share the result.
Assumption: the digits of each number are contiguous and not interspersed with other characters, with the exception of , and .
# Remove all characters that are not digits, commas, or periods.
(df[['Type', 'Size']]
 .apply(lambda x: x.str.replace(r'[^\d\.,]', '', regex=True), axis=1)
 .reset_index())
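
For reference, here is a minimal, self-contained sketch of the same idea using the column names from the question. The sample values are made up for illustration (the original data was not posted as text), so treat the printed result as an example only. It also shows a column-by-column `str.extract` variant, since the `.str` accessor is available on a Series but not on a DataFrame, which is likely why the attempt in the question fails.

```python
import pandas as pd

# Hypothetical sample data: the column names come from the question,
# the values are invented for illustration.
df = pd.DataFrame({
    "Diatom col": ["Diatom 12", "Diatom 7", "none"],
    "Cistos col": ["Cistos 3", "Cistos 45", "Cistos 1,5"],
})

cols = ["Diatom col", "Cistos col"]

# Approach from the answer: remove everything that is not a digit, comma, or period.
# apply() passes each column as a Series, so the .str accessor is available.
stripped = df[cols].apply(lambda s: s.str.replace(r"[^\d.,]", "", regex=True))

# Alternative: run str.extract on each column separately.
# expand=False returns a Series; rows without any digit become NaN.
extracted = df[cols].apply(lambda s: s.str.extract(r"(\d+)", expand=False))

print(stripped)
print(extracted)
```

Note the difference between the two: the replace version keeps `1,5` intact but leaves an empty string where there is no number, while `(\d+)` captures only the first run of digits and gives NaN when nothing matches. The results are still strings, so convert them (for example with `pd.to_numeric`) if you need numeric columns.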