Check if a pandas Series value contains a numeric character

Time:10-05

Is there a way to check whether a pandas Series value contains any numeric characters, and to replace the values that contain none with NaN? Series.str.isnumeric only checks whether all characters are numeric.

Given the following series:

import pandas as pd
# A dict literal cannot hold the key 'c' twice (the last value wins), so pass values and labels separately
ser = pd.Series(["Python", "$|", "|32", "dos"], index=['a', 'b', 'c', 'c'])

The only label whose values contain a numeric character is c, so the values of a and b should be replaced with NaN. I am struggling to find a solution because there are multiple values for c, so ser["c"] is itself a pd.Series, according to type(ser["c"]).

CodePudding user response:

Use any with apply to test each value individually:

res = ser.apply(lambda x: x if any(c.isnumeric() for c in x) else pd.NA)
print(res)

Output

a    <NA>
b    <NA>
c     |32
c    <NA>
dtype: object
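
To make the difference from Series.str.isnumeric concrete, here is a minimal sketch that compares the "all characters" check with the "any character" check. The sample series and the names all_numeric and any_numeric are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical sample data with unique labels, chosen only to show the two checks
ser = pd.Series(["Python", "$|", "|32", "42"], index=["a", "b", "c", "d"])

# True only when every character of the string is numeric
all_numeric = ser.str.isnumeric()

# True when at least one character of the string is numeric
any_numeric = ser.apply(lambda s: any(ch.isnumeric() for ch in s))

print(all_numeric.tolist())  # [False, False, False, True]
print(any_numeric.tolist())  # [False, False, True, True]
```

Only "|32" differs between the two checks: it contains digits but is not made up entirely of them.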

CodePudding user response:

Try str.contains(r'\d') to check for the existence of a digit (the raw string keeps Python from treating \d as a string escape). Then use groupby(level=0).transform('any') to propagate the result to every row that shares the same index label. Finally, use where to mask the values of the series:

ser.where(ser.str.contains(r'\d').groupby(level=0).transform('any'))

Output:

a    NaN
b    NaN
c    |32
c    dos
dtype: object
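
The three steps can also be wrapped in a small self-contained function. This is only a sketch: the name keep_labels_with_digit is hypothetical, and na=False is an added assumption so that missing values count as "no digit".

```python
import pandas as pd


def keep_labels_with_digit(ser: pd.Series) -> pd.Series:
    """Keep rows whose index label has at least one value containing a digit."""
    # Step 1: per-row check for a digit; missing values are treated as "no digit"
    has_digit = ser.str.contains(r"\d", na=False)
    # Step 2: a label passes if any of its rows passed
    label_has_digit = has_digit.groupby(level=0).transform("any")
    # Step 3: rows whose label did not pass become NaN
    return ser.where(label_has_digit)


# Build the series from a list so that the label "c" can appear twice
ser = pd.Series(["Python", "$|", "|32", "dos"], index=["a", "b", "c", "c"])
res = keep_labels_with_digit(ser)
print(res)
```

Here the rows for a and b come out as missing and both c rows are kept. Note that \d and str.isnumeric are not quite the same test: a character such as "½" is numeric according to isnumeric but is not matched by \d.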