As you can see my df contains a price list with values like $106.00
and '1,190.00
. I want to get the values to a numeric value. So I want to replace the $ sign. But that didn't work.
df = pd.DataFrame({'id':['A', 'B', 'C', 'D', 'E'], 'price':['$106.00', '$156.00',
'$166.00', '$106.00', '1,190.00']})
df['price'] = pd.to_numeric(df.price.str.replace("$",""))
# df['price'] = pd.to_numeric(df.price.str[1:])
# that givs me a ValueError: Unable to parse string "1,925.00" at position 7765
What I want at the end
ID price
A 106.00
B 156.00
C 166.00
D 106.00
E 1,190.00
CodePudding user response:
You can use regex
and replace '\$'
and ','
with ''
then convert to numeric like below: (we use '|'
for search $
or ,
)
>>> df.price = pd.to_numeric(df.price.str.replace(r"\$|,","", regex=True))
>>> df
id price
0 A 106.0
1 B 156.0
2 C 166.0
3 D 106.0
4 E 1190.0
CodePudding user response:
Try like this. Using regex substitute all the non-digit values with '' and then convert it to decimal. Apply this for all values in 'price' list
import pandas as pd
from re import sub
from decimal import Decimal
df = pd.DataFrame({'id':['A', 'B', 'C', 'D', 'E'], 'price':['$106.00', '$156.00',
'$166.00', '$106.00', '1,190.00']})
df['price']=df['price'].apply(lambda x: Decimal(sub(r'[^\d.]', '', x)))
print(df)
output
id price
0 A 106.00
1 B 156.00
2 C 166.00
3 D 106.00
4 E 1190.00