How to insert a '.' before the last characters from a number?

Time:12-09

Good morning, everyone. I'm cleaning a dataset in which I had to remove special characters and replace ',' with '.'. After that I tried to convert the column to float, but it raised the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-87-a06f3d16fade> in <module>
----> 1 df2['Price']= df2['Price'].astype(float)

~\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
   5544         else:
   5545             # else, only a single dtype is given
-> 5546             new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors,)
   5547             return self._constructor(new_data).__finalize__(self, method="astype")
   5548 

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, copy, errors)
    593         self, dtype, copy: bool = False, errors: str = "raise"
    594     ) -> "BlockManager":
--> 595         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
    596 
    597     def convert(

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, align_keys, **kwargs)
    404                 applied = b.apply(f, **kwargs)
    405             else:
--> 406                 applied = getattr(b, f)(**kwargs)
    407             result_blocks = _extend_blocks(applied, result_blocks)
    408 

~\anaconda3\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors)
    593             vals1d = values.ravel()
    594             try:
--> 595                 values = astype_nansafe(vals1d, dtype, copy=True)
    596             except (ValueError, TypeError):
    597                 # e.g. astype_nansafe can fail on object-dtype of strings

~\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy, skipna)
    993     if copy or is_object_dtype(arr) or is_object_dtype(dtype):
    994         # Explicit copy, or required since NumPy can't view from / to object.
--> 995         return arr.astype(dtype, copy=True)
    996 
    997     return arr.view(dtype)

ValueError: could not convert string to float: '1.087.785000'

This is the code I ran before Python raised that error:

import re
def remove_chars(s):
    return re.sub('[^0-9] ', '', s)
df2['Price'] = df2['Price'].apply(remove_chars)
df2["Retail"] = df2["Retail"].apply(remove_chars)
df2['Price']=df2["Price"].astype(float)
df2['Price'] = df2.Price.apply(lambda x: '{:,.3f}'.format(x))
df2["Price"] = df2["Price"].str.replace(".","")
df2["Price"] = df2["Price"].str.replace(",",".")
df2['Price']= df2['Price'].astype(float)

CodePudding user response:

If you want to specify which character is recognized as the decimal point, you can do it when reading the file with read_csv, using the decimal parameter, so there is no need to modify the values later. E.g.

df = pd.read_csv("<your_file>", decimal = ",")

The above code will automatically parse numbers that use ',' as the decimal point.

Likewise, if you want to specify the thousands separator, you can do that in read_csv with the thousands parameter.
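As a minimal sketch of both parameters together, the snippet below reads a small in-memory CSV. The sample data, the ';' column separator, and the column names are assumptions made for illustration; they are not taken from the asker's file.

```python
import io

import pandas as pd

# Hypothetical sample: ';' separates columns, '.' groups thousands,
# and ',' marks the decimal point.
csv_text = "Item;Price\nA;1.087,785\nB;12,5\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",
    decimal=",",    # treat ',' as the decimal point
    thousands=".",  # treat '.' as the thousands separator
)

print(df["Price"].dtype)     # the column is parsed as a float dtype
print(df["Price"].tolist())
```

With this approach the column arrives as a numeric type, so no string replacement or astype call should be needed afterwards.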

CodePudding user response:

You should apply this first:

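# Assumes '.' is the thousands separator and ',' is the decimal mark.
df2["Price"] = df2["Price"].str.replace(".", "", regex=False)   # drop thousands separators first
df2["Price"] = df2["Price"].str.replace(",", ".", regex=False)  # then turn the decimal comma into a dot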

then:

df2['Price']=df2["Price"].astype(float)
df2['Price'] = df2.Price.apply(lambda x: '{:,.3f}'.format(x))
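If the data is already loaded as strings, the clean-up can be sketched on a small Series as follows. The sample values are hypothetical, and the sketch assumes '.' is the thousands separator and ',' is the decimal mark. Under that assumption the thousands separators have to be removed before the decimal comma becomes a dot, otherwise the decimal point would be removed too.

```python
import pandas as pd

# Hypothetical values in the assumed format: '.' for thousands, ',' for decimals.
prices = pd.Series(["1.087,785", "12,5", "300"])

cleaned = (
    prices
    .str.replace(".", "", regex=False)   # remove thousands separators (literal match)
    .str.replace(",", ".", regex=False)  # convert the decimal comma to a dot
    .astype(float)
)

print(cleaned.tolist())
```

Passing regex=False makes the '.' match a literal dot rather than being interpreted as a regular-expression wildcard, whose default handling has differed between pandas versions.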