Here is my sample dataframe. I would like to convert the dtypes to boolean in columns A and B, string in C, and integer in D and E.
I am trying to use pandas' convert_dtypes() method, but it returns string for every column. How can I convert the types "automatically"?
{'A': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: 'true',
8: nan,
9: 'true'},
'B': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: 'true',
8: nan,
9: 'true'},
'C': {0: 'CustomersData',
1: 'CustomersData',
2: 'CustomersData',
3: 'CustomersData',
4: 'CustomersData',
5: 'CustomersData',
6: 'CustomersData',
7: 'TestData',
8: 'CustomersData',
9: 'CustomersData'},
'D': {0: '4014',
1: '4014',
2: '4014',
3: '4014',
4: '4014',
5: '4014',
6: '4014',
7: '500',
8: '4014',
9: '500'},
'E': {0: '8',
1: '8',
2: '8',
3: '8',
4: '8',
5: '8',
6: '13',
7: '13',
8: '8',
9: '13'}}
df.convert_dtypes().dtypes
gives:
A string
B string
C string
D string
E string
CodePudding user response:
The only thing that worked for me is a workaround: save the dataframe as CSV and load it again. pandas' read_csv infers the column types when it loads the file. I would be happy to learn how to do this without the round trip.
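The round trip described above does not need a file on disk. Here is a minimal sketch, assuming a trimmed copy of the sample data and an in-memory io.StringIO buffer (both are illustration choices, not part of the original answer):

```python
import io

import numpy as np
import pandas as pd

# Trimmed copy of the sample data from the question (assumption: 4 rows are enough).
df = pd.DataFrame({
    "A": [np.nan, "true", np.nan, "true"],
    "C": ["CustomersData", "TestData", "CustomersData", "CustomersData"],
    "D": ["4014", "500", "4014", "500"],
})

# Write the frame to an in-memory CSV buffer, then rewind and read it back.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
roundtrip = pd.read_csv(buffer)

# read_csv parses the text again, so D is no longer a column of strings.
print(roundtrip.dtypes)
```

D comes back as an integer column. Because A contains missing values it cannot be a plain bool column, so the dtype reported for it may be object rather than bool, depending on the pandas version and the read_csv options used.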
CodePudding user response:
First check which dtypes the dataframe holds per column.
print(df.dtypes)
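Then convert each column explicitly. Note that astype(bool) would turn every value in A and B into True, NaN included, so map the text to booleans first and use the nullable boolean dtype.
# 'false' is listed as an assumption; the sample data only contains 'true'
df['A'] = df['A'].map({'true': True, 'false': False}).astype('boolean')
df['B'] = df['B'].map({'true': True, 'false': False}).astype('boolean')
df['C'] = df['C'].astype('string')
df['D'] = df['D'].astype(int)
df['E'] = df['E'].astype(int)
Then check that the values were converted properly.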
print(df.dtypes)
This should work. Let me know if it does not.
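For the "automatic" part of the question: convert_dtypes() picks a nullable dtype for the values as they already are and does not parse text, which is why columns of strings stay string. One possible approach is to inspect the text of each column and choose a dtype from it. The sketch below is only an illustration: infer_column is a hypothetical helper, it assumes every column holds strings or NaN, and it assumes booleans are spelled 'true' or 'false'.

```python
import numpy as np
import pandas as pd


def infer_column(col: pd.Series) -> pd.Series:
    """Hypothetical helper: choose a nullable dtype from the text in one column."""
    values = col.dropna().astype(str).str.strip()
    if values.empty:
        return col  # nothing to infer from, leave the column unchanged
    # Booleans: every non-missing value is the text 'true' or 'false' (any case).
    if values.str.lower().isin(["true", "false"]).all():
        lowered = col.str.strip().str.lower()
        return lowered.map({"true": True, "false": False}).astype("boolean")
    # Integers: every non-missing value is an optional sign followed by digits.
    if values.str.fullmatch(r"[+-]?\d+").all():
        return pd.to_numeric(col).astype("Int64")
    # Anything else is kept as text.
    return col.astype("string")


# Trimmed copy of the sample data from the question.
df = pd.DataFrame({
    "A": [np.nan, "true", np.nan, "true"],
    "C": ["CustomersData", "TestData", "CustomersData", "CustomersData"],
    "D": ["4014", "500", "4014", "500"],
})

# Build a new frame column by column so each column keeps its own dtype.
converted = pd.DataFrame({name: infer_column(df[name]) for name in df.columns})
print(converted.dtypes)
```

The nullable dtypes ('boolean', 'Int64', 'string') are used so that missing values stay missing instead of forcing a column to object or float.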