How to convert negative value objects in a DataFrame to float in Python

Time:06-28

Maybe someone can help me. I would like to create a function that converts object (string) values to float. I tried to find a solution, but I always get errors:

import pandas as pd

# sample dataframe
d = {'price':['−$13.79', '−$ 13234.79', '$ 132834.79', 'R$ 75.900,00', 'R$ 69.375,12', '- $ 2344.92']}
df = pd.DataFrame(data=d)

I tried this code first, just to find a working solution.

df['price'] = (df.price.str.replace("−$", "-").str.replace(r'\w \$\s ', '').str.replace('.', '')\
                   .str.replace(',', '').astype(float)) / 100

The idea was to convert −$ to - (for negative values), then replace $ with ''.

But as a result I get:

ValueError: could not convert string to float: '−$1379'
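
A likely cause, assuming a pandas version in which str.replace interprets the pattern as a regular expression: in a regex, "$" is an end-of-string anchor, so the pattern "−$" never matches the literal text "−$". The sketch below reproduces this with the standard re module on one sample value; the variable names are only illustrative.

```python
import re

# Same text as the first sample value; "\u2212" is the Unicode minus sign
s = "\u2212$13.79"

# As a regex, "$" anchors the end of the string, so nothing is replaced here
unchanged = re.sub("\u2212$", "-", s)

# Escaping the pattern turns "$" into a literal dollar sign
fixed = re.sub(re.escape("\u2212$"), "-", s)

print(unchanged == s)  # the first attempt left the string untouched
print(fixed)           # the escaped pattern produced an ASCII minus
```

In pandas the same effect can be had by passing regex=False to str.replace, or by escaping the dollar sign in the pattern.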

CodePudding user response:

You can extract the digits on one side, detect whether there is a leading minus on the other side, then combine the two:

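import numpy as np

# -1/100 for values starting with a minus (U+2212 or ASCII "-"), else 1/100
factor = np.where(df['price'].str.match(r'[−-]'), -1, 1)/100
# strip every non-digit, convert to numbers, then apply sign and scale
out = (pd.to_numeric(df['price'].str.replace(r'\D', '', regex=True), errors='coerce')
         .mul(factor)
       )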

output:

0       -13.79
1    -13234.79
2    132834.79
3     75900.00
4     69375.12
5     -2344.92
Name: price, dtype: float64
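
Since the question asks for a function, the same idea could be wrapped up roughly as follows. This is only a sketch: the name price_to_float is made up for illustration, and it assumes every value carries exactly two decimal digits, which is what makes the division by 100 valid.

```python
import numpy as np
import pandas as pd

def price_to_float(s: pd.Series) -> pd.Series:
    """Convert price strings to float, assuming two decimal digits per value."""
    s = s.astype(str).str.strip()
    # -1 for values starting with a Unicode minus (U+2212) or an ASCII "-"
    sign = np.where(s.str.match(r"[\u2212-]"), -1.0, 1.0)
    # Drop everything that is not a digit; what remains is the amount in cents
    cents = pd.to_numeric(s.str.replace(r"\D", "", regex=True), errors="coerce")
    return cents.mul(sign).div(100)

prices = pd.Series(
    ["\u2212$13.79", "\u2212$ 13234.79", "$ 132834.79",
     "R$ 75.900,00", "R$ 69.375,12", "- $ 2344.92"],
    name="price",
)
result = price_to_float(prices)
print(result)
```

Removing all separators and dividing by 100 sidesteps the mix of "." and "," as decimal marks, but it would give wrong results for values with a different number of decimals.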

CodePudding user response:

Can you use re?

Like this:

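import re
# normalise the Unicode minus to "-", strip everything except digits and "-", then convert each value
df['price'] = df['price'].map(lambda s: float(re.sub(r'[^\-0-9]', '', s.replace('−', '-'))) / 100)

I'm just removing by regex all the symbols that are not 0-9 or "-", after replacing the Unicode minus with a plain "-".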

BTW, no clue why you divide it by 100...

CodePudding user response:

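# replace the Unicode minus with "-", then drop "R", "$", whitespace, dots and commas
cleaned = df["price"].str.replace("−", "-", regex=False).str.replace(r"[R$\s.,]", "", regex=True)
df["price2"] = pd.to_numeric(cleaned) / 100
df["price3"] = cleaned.astype(float) / 100
df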

A few notes:

In a regex, an unescaped dot matches any character. The − symbol you are using is not the ASCII minus (hyphen-minus); it appears to be the Unicode minus sign (U+2212). I would use something like https://regex101.com for debugging.
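
To check which character a string really contains, the standard unicodedata module can name it. A small sketch, using the first characters of the first and last sample values (written with an escape so the difference is visible in the source):

```python
import unicodedata

first = "\u2212$13.79"[0]   # leading character of the first sample value
last = "- $ 2344.92"[0]     # leading character of the last sample value

name_first = unicodedata.name(first)
name_last = unicodedata.name(last)

print(hex(ord(first)), name_first)  # 0x2212 MINUS SIGN
print(hex(ord(last)), name_last)    # 0x2d HYPHEN-MINUS
```

Only the second character is accepted by float() as a sign, so the first one has to be replaced before conversion.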