I have a dataset with a column named price, and I need to change its data type to float64.
This is the price data when I open the dataset with pandas:
0 29.9
1 39.9
2 499.99
3 539.99
4 519.99
...
397729 139.99
When I try to change it, I get this message: ValueError: could not convert string to float: ''
When I try to find where these '' values (empty strings, or whatever they are) occur, this is what I see:
df.price.apply(lambda x: x.replace('','x'))
0 x2x9x.x9x
1 x3x9x.x9x
2 x4x9x9x.x9x9x
3 x5x3x9x.x9x9x
4 x5x1x9x.x9x9x
...
397729 x1x3x9x.x9x9x
I have tried replacing the values twice, but they stay the same, with the '' in the middle. I cannot replace them with 0 because I need the values.
I need my data to look like this, but I need to be able to convert it to float:
0 29.9
1 39.9
2 499.99
3 539.99
4 519.99
...
397729 139.99
CodePudding user response:
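You can use pd.to_numeric. With errors='coerce', values that cannot be parsed as numbers (such as empty strings) become NaN instead of raising an error: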
df['price'] = pd.to_numeric(df['price'], errors='coerce')
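A note on the diagnostic in the question: in Python, replacing '' with 'x' inserts an 'x' at every position between characters, so that output does not show hidden characters inside the numbers. The error message suggests that some rows hold an empty string instead. Here is a minimal sketch of how you might coerce the column and then look at the rows that failed to convert. The sample values and the names raw, converted and bad_rows are made up for illustration and are not from the original dataset.

```python
import pandas as pd

# str.replace('', 'x') matches the empty string at every position,
# which is why the question's output shows an x between all characters
print("29.9".replace("", "x"))  # x2x9x.x9x

# Hypothetical sample: one empty string among numeric strings
df = pd.DataFrame({"price": ["29.9", "39.9", "", "539.99"]})

raw = df["price"]
converted = pd.to_numeric(raw, errors="coerce")

# Rows that turned into NaN although they were not missing originally
bad_rows = raw[converted.isna() & raw.notna()]
print(bad_rows)

df["price"] = converted
print(df["price"].dtype)
```

Inspecting bad_rows before overwriting the column lets you decide whether those rows should stay as NaN, be dropped, or be fixed at the source.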
CodePudding user response:
You can try something like the following:
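import pandas as pd

# Sample data: prices stored as strings
df = pd.DataFrame({"id": ["0", "1", "2", "3", "4"],
                   "price": ["29.9", "39.9", "499.9", "539.99", "519.99"]})
print(type(df["price"][0]))  # type before conversion
# Convert only the price column to float
df = df.astype({"price": float})
print(type(df["price"][0]))  # type after conversion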
Result:
<class 'str'>
<class 'numpy.float64'>
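Note that astype raises the same ValueError as in the question if the column contains empty strings. One possible workaround, sketched here on made-up sample values (the name cleaned is hypothetical), is to turn empty or whitespace-only cells into NaN before converting:

```python
import pandas as pd

# Hypothetical sample: includes an empty string and a whitespace-only cell
df = pd.DataFrame({"price": ["29.9", " 39.9 ", "", "   ", "539.99"]})

# Strip surrounding whitespace so whitespace-only cells become ''
cleaned = df["price"].str.strip()

# Replace empty strings with NaN; all other values are kept
cleaned = cleaned.mask(cleaned == "")

# The remaining strings are numeric, so astype no longer raises
df["price"] = cleaned.astype(float)
print(df["price"].dtype)
print(df["price"].isna().sum())
```

This keeps every real price and marks only the empty cells as missing, which matches the requirement of not replacing them with 0.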