I have a column with the following type of values:
I have to define a function that converts the Diameter column to float and handles the exceptional values. In particular:
- when the value is 42x54, compute sqrt(42^2 + 54^2)
- when the value is Steel, return NaN
CodePudding user response:
You can use str.extract with a regex to get the numbers, then square them with pow, sum the columns, and get the square root with numpy.sqrt:
import numpy as np
df['Diameter2'] = np.sqrt(
    df['Diameter']
    .str.extract(r'(\d+)(?:\s*x\s*(\d+))?')  # one or two numbers per row
    .astype(float).pow(2)                     # square each number
    .sum(axis=1, min_count=1)                 # NaN when no number was found
)
output:
  Diameter  Diameter2
0       44  44.000000
1  42 x 54  68.410526
2    Steel        NaN
How the regex works:
(\d+)               # capture a number
(?:\s*x\s*(\d+))?   # optionally capture a second number, if preceded by x with optional spaces
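Since the question asks for a function, the same chain can be wrapped in one. This is only a minimal sketch: the name diameter_to_float and the sample frame are made up for illustration, and it assumes the diameters are whole numbers written as "44", "42x54" or "42 x 54".

```python
import numpy as np
import pandas as pd


def diameter_to_float(s: pd.Series) -> pd.Series:
    """Convert '44' or '42 x 54' style values to floats; other values become NaN."""
    # Two capture groups: the first number, and an optional second number after 'x'.
    parts = s.astype(str).str.extract(r'(\d+)(?:\s*x\s*(\d+))?').astype(float)
    # Square each part and sum per row. min_count=1 keeps rows with no
    # number at all as NaN instead of summing them to 0.
    return np.sqrt(parts.pow(2).sum(axis=1, min_count=1))


# Hypothetical sample data mirroring the values in the question.
df = pd.DataFrame({'Diameter': ['44', '42 x 54', '42x54', 'Steel']})
df['Diameter2'] = diameter_to_float(df['Diameter'])
print(df)
```

Note that min_count=1 is what makes Steel come out as NaN; without it the row would sum to 0 and the square root would be 0.0. The regex is not anchored, so a value such as "Steel 5" would yield 5.0; if that matters, a stricter pattern would be needed.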