I'm getting this dict as input, and I want to make sure that columns containing only numbers are given float or integer types. To do that, I'm using pd.to_numeric on this DataFrame:
import pandas as pd
df = pd.DataFrame({"a": ["1.1", "2"], "b": ["2", "blue"], "c": ["1", "2"]})
Then I'm using this function:
def convert_string_columns_to_numeric(df: pd.DataFrame):
    for col in df.columns:
        series_column = pd.Series(df[col])
        df[col] = pd.to_numeric(series_column, errors='ignore')
    return df
I get these dtypes:
["float","object", "object"]
Why isn't "c" of int type?
CodePudding user response:
It is of int type:
df = pd.DataFrame(data = {"a":["1","1.2"],"b":["2","bla"],"c":["2","4"]})
df = df.apply(pd.to_numeric, errors='ignore')
print(df.dtypes)
Output:
a float64
b object
c int64
dtype: object
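As a side note, errors='ignore' is deprecated in recent pandas releases (2.2 and later, as far as I know), so the same idea can be sketched with a try/except instead. This is only a minimal sketch, not the asker's exact code: it reuses the function name from the question, works on a copy rather than modifying the input, and the variable names out and converted are my own.

```python
import pandas as pd


def convert_string_columns_to_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df where fully numeric columns are converted."""
    out = df.copy()  # assumption: we prefer not to mutate the caller's frame
    for col in out.columns:
        try:
            # Raises if any value in the column cannot be parsed as a number.
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            # Leave columns such as "b" (contains "blue") untouched.
            pass
    return out


df = pd.DataFrame({"a": ["1.1", "2"], "b": ["2", "blue"], "c": ["1", "2"]})
converted = convert_string_columns_to_numeric(df)
print(converted.dtypes)
```

With the data from the question, a should come out as float64 and c as int64, while b keeps its original string type.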
CodePudding user response:
If you want column c to be of type int, you should not use quotation marks. Why not create the dataframe like this (note quotation marks removed from column a as well):
df = pd.DataFrame({"a":[1, 1.2],"b":["2","bla"], "c":[2, 4]})
Now df.dtypes will give you:
a float64
b object
c int64
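If the values arrive as strings and you cannot change how the dict is built, another option is to state the target types explicitly with astype. This is a rough sketch that assumes you already know which columns are numeric (here a and c); the names raw and typed are just for illustration.

```python
import pandas as pd

# Same string data as in the question.
raw = pd.DataFrame({"a": ["1.1", "2"], "b": ["2", "blue"], "c": ["1", "2"]})

# Cast only the columns we know are numeric; "b" is left as it is.
typed = raw.astype({"a": "float64", "c": "int64"})
print(typed.dtypes)
```

Unlike pd.to_numeric, this raises an error if a listed column contains a value that cannot be cast, which can be useful when a non-numeric value would indicate bad input.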