I have a dataset that looks like this
and the datatypes look like this
I am attempting to change the datatype of cubicinches and weightlbs to float or integer, but none of these options works:
df["cubicinches"]=df["cubicinches"].astype(float)
df = df.astype({"weightlbs": float, "cubicinches": float})
df['weightlbs'] = pd.to_numeric(df['weightlbs'])
Please help.
CodePudding user response:
Note: it seems most of your columns have a leading whitespace in their names: 'cubicinches' is actually ' cubicinches'. What exactly is the error message?
Try this first:
df.columns = df.columns.str.strip()
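A minimal sketch of why stripping the names matters. The frame below is made up for illustration (the values are assumptions, not your data); only the leading space in the column names mirrors your case:

```python
import pandas as pd

# Hypothetical sample: column names carry a leading space
df = pd.DataFrame({" cubicinches": ["350", "89"], " weightlbs": ["4209", "1925"]})

# Remove leading/trailing whitespace from every column name
df.columns = df.columns.str.strip()

# The columns can now be addressed by their clean names
df = df.astype({"cubicinches": float, "weightlbs": float})
print(df.dtypes)
```

Without the strip, df["cubicinches"] raises a KeyError because the real key is " cubicinches".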
If you can't cast to float, it's because you have some non-numeric values in your column. As suggested by @enke, use pd.to_numeric in a different way, with errors='coerce'.
You can find the offending values with:
out = df.loc[pd.to_numeric(df['cubicinches'], errors='coerce').isna(), 'cubicinches']
print(out)
# Output
2    wrong
3    value
Name: cubicinches, dtype: object
Setup:
df = pd.DataFrame({'cubicinches': ['3.2', '0.8', 'wrong', 'value', '7.78']})
print(df)
# Output
  cubicinches
0         3.2
1         0.8
2       wrong
3       value
4        7.78
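Once you know which values are bad, one possible way to finish the conversion is to let pd.to_numeric turn them into NaN and then decide what to do with those rows. This is only a sketch on the same sample frame; whether to drop, fill, or fix the bad rows depends on your data:

```python
import pandas as pd

df = pd.DataFrame({'cubicinches': ['3.2', '0.8', 'wrong', 'value', '7.78']})

# errors='coerce' replaces unparseable entries with NaN instead of raising
df['cubicinches'] = pd.to_numeric(df['cubicinches'], errors='coerce')

# One option among several: drop the rows that could not be converted
clean = df.dropna(subset=['cubicinches'])
print(clean)
```

The column ends up as float64 because NaN is a float value.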
CodePudding user response:
One of these columns may contain an entry that is an actual string (letters or other non-numeric characters). To check, look at the unique values in each column and try to find it.
For example:
df["cubicinches"].value_counts()