I see quite a number of questions about assigning dtypes, but most of them are outdated and recommend manual assignment.
It looks like a new method, df.convert_dtypes(), is available, but somehow it does not work in my case.
When I load CSV files, every column's dtype is object, and even after calling convert_dtypes() the dtypes are still object.
I would really appreciate any help.
df = pd.read_csv('file.csv', header=5)
df = df.convert_dtypes()
CodePudding user response:
Could you provide some sample data? It seems to work fine for me:
import pandas as pd
import numpy as np
print('pandas version:', pd.__version__)
print('numpy version :', np.__version__)
df = pd.DataFrame(
    {
        'string': pd.Series(['a', 'b', 'c'], dtype=np.dtype('O')),
        'bool': pd.Series([True, False, True], dtype=np.dtype('O')),
        'int': pd.Series([0, 1, 2], dtype=np.dtype('O')),
        'float': pd.Series([0.0, 1.1, 2.2], dtype=np.dtype('O')),
    }
)
print(df)
print(df.dtypes)
df_new = df.convert_dtypes()
print(df_new.dtypes)
Output:
pandas version: 1.3.2
numpy version : 1.20.3
  string   bool int float
0      a   True   0   0.0
1      b  False   1   1.1
2      c   True   2   2.2
string    object
bool      object
int       object
float     object
dtype: object
string     string
bool      boolean
int         Int64
float     Float64
dtype: object
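One possible reason the example above converts cleanly while a CSV load does not: here the object columns hold real Python ints, bools and floats, whereas an object column coming out of read_csv usually holds text. This is only a sketch under that assumption (the asker's data was not posted); it uses pd.api.types.infer_dtype to show what the values in an object column actually are.

```python
import pandas as pd

# Object column holding real Python ints, as in the example above.
real_ints = pd.Series([0, 1, 2], dtype=object)

# Object column holding digit strings. This is an assumption about
# what the CSV load may have produced, not the asker's actual data.
digit_strings = pd.Series(['0', '1', '2'], dtype=object)

# infer_dtype looks at the values, not at the column's declared dtype.
kind_of_real_ints = pd.api.types.infer_dtype(real_ints)
kind_of_digit_strings = pd.api.types.infer_dtype(digit_strings)
print(kind_of_real_ints)
print(kind_of_digit_strings)

# convert_dtypes follows the values: the ints become a nullable integer
# column, while the digit strings stay text rather than becoming numbers.
print(real_ints.convert_dtypes().dtype)
print(digit_strings.convert_dtypes().dtype)
```

If infer_dtype reports a text or mixed kind for a column you expected to be numeric, convert_dtypes alone will not turn it into a number.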
CodePudding user response:
Try this:
import pandas as pd
df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': ['d', 'e', 'f']})
print("--------DataType of DataFrame---------")
print(df.dtypes)
print("--------DataType of DataFrame after converting---------")
df1 = df.convert_dtypes()
print(df1.dtypes)
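If a column that should be numeric still comes out as text after convert_dtypes(), it may contain a non-numeric token somewhere. The following is a minimal sketch of one way to handle that, not necessarily the fix for the original file: the CSV text, the column names and the '--' placeholder are all made up for illustration. It coerces the column with pd.to_numeric first and then lets convert_dtypes() pick a nullable dtype.

```python
import io

import pandas as pd

# Hypothetical CSV: the 'count' column has one non-numeric placeholder.
csv_text = "name,count\na,1\nb,2\nc,--\n"
df = pd.read_csv(io.StringIO(csv_text))

# Because of the '--' entry, 'count' is read as text, not as numbers.
print(df.dtypes)

# Coerce to numbers; values that cannot be parsed become missing.
df['count'] = pd.to_numeric(df['count'], errors='coerce')

# Now convert_dtypes can choose a nullable numeric dtype for 'count'.
df = df.convert_dtypes()
print(df.dtypes)
print(df)
```

Note that errors='coerce' silently turns unparseable entries into missing values, so it is worth checking how many entries were lost (for example with df['count'].isna().sum()) before relying on the result.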