My pandas data frame is as below. IN the "price" column, I want to convert non digit values to nan. Thanks.
Input:
ID, name, qty, price
1, pen, 4, 55
2, book, 5, ai
3, PENCIL, 5, 100
4, book2, 10, si
output:
ID, name, qty, price
1, pen, 4, 55
2, book, 5, nan
3, PENCIL, 5, 100
4, book2, 10, nan
CodePudding user response:
Use pd.to_numeric
:
df['price'] = pd.to_numeric(df['price'], errors='coerce')
If they're strings, use mask
and str.isnumeric
:
df['price'] = df['price'].mask(~df['price'].str.isnumeric())
mask
automatically converts data that match the condition to NaN
.
Or use str.isaplha
:
df['price'] = df['price'].mask(df['price'].str.isalpha())
CodePudding user response:
You could use str.replace
here:
df["price"] = df["price"].str.replace(r'^.*\D.*$', np.NaN)
The regex pattern ^.*\D.*$
will match any price
entry having at least one non digit character, and will selectively replace with nan
.