How to infer and convert dtypes in pandas dataframe?

Time:12-16

Here is my sample dataframe. I would like to convert the dtypes to boolean in columns A and B, string in C, and integer in D and E. I am trying to use pandas' convert_dtypes() method, but it returns string for every column. How can I convert the types "automatically"?

{'A': {0: nan,
      1: nan,
      2: nan,
      3: nan,
      4: nan,
      5: nan,
      6: nan,
      7: 'true',
      8: nan,
      9: 'true'},
     'B': {0: nan,
      1: nan,
      2: nan,
      3: nan,
      4: nan,
      5: nan,
      6: nan,
      7: 'true',
      8: nan,
      9: 'true'},
     'C': {0: 'CustomersData',
      1: 'CustomersData',
      2: 'CustomersData',
      3: 'CustomersData',
      4: 'CustomersData',
      5: 'CustomersData',
      6: 'CustomersData',
      7: 'TestData',
      8: 'CustomersData',
      9: 'CustomersData'},
     'D': {0: '4014',
      1: '4014',
      2: '4014',
      3: '4014',
      4: '4014',
      5: '4014',
      6: '4014',
      7: '500',
      8: '4014',
      9: '500'},
     'E': {0: '8',
      1: '8',
      2: '8',
      3: '8',
      4: '8',
      5: '8',
      6: '13',
      7: '13',
      8: '8',
      9: '13'}}

df.convert_dtypes().dtypes gives:

A string
B string
C string
D string
E string

CodePudding user response:

The only approach that worked for me is a workaround: save the dataframe as CSV and load it again. pandas' read_csv infers the column types while loading, and that worked for me. I would be happy to learn of a way to solve this without the workaround.
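
A minimal sketch of that round trip, assuming an in-memory buffer (io.StringIO) in place of a file on disk and a small made-up frame in place of the full sample:

```python
import io

import numpy as np
import pandas as pd

# Small stand-in for the sample dataframe (assumed values, same kinds of data).
df = pd.DataFrame({
    "A": [np.nan, "true", np.nan, "true"],
    "C": ["CustomersData", "TestData", "CustomersData", "CustomersData"],
    "D": ["4014", "500", "4014", "500"],
})

# Write the frame as CSV text into a buffer, then parse it back.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
roundtrip = pd.read_csv(buffer)

# read_csv re-infers a dtype for every column while parsing.
print(roundtrip.dtypes)
```

Columns that mix 'true' with missing values may not come back as a plain bool dtype, so it is worth printing the dtypes after loading.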

CodePudding user response:

First check which dtypes the dataframe holds per column.

print(df.dtypes)
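
It can also help to look at the Python type of the individual values. The sketch below uses a small made-up frame in place of the sample data. It shows that every non-missing value is a str, which is the likely reason convert_dtypes() reports string: it picks a dtype for the values as they are and does not parse text into numbers or booleans.

```python
import numpy as np
import pandas as pd

# Small stand-in for the sample dataframe (assumed values).
df = pd.DataFrame({
    "A": [np.nan, "true", np.nan, "true"],
    "D": ["4014", "500", "4014", "500"],
})

print(df.dtypes)

# Python type of each non-missing value, collected per column.
value_types = {col: set(df[col].dropna().map(type)) for col in df.columns}
print(value_types)
```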

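Then convert each column explicitly. Note that astype(bool) would turn both NaN and every non-empty string into True, so map the 'true'/'false' strings first and use the nullable boolean dtype, which keeps missing values as <NA>.

bool_map = {'true': True, 'false': False}
df['A'] = df['A'].map(bool_map).astype('boolean')
df['B'] = df['B'].map(bool_map).astype('boolean')
df['C'] = df['C'].astype('string')
df['D'] = df['D'].astype(int)
df['E'] = df['E'].astype(int)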

Then check that the values were converted properly.

print(df.dtypes)

This should work. Let me know if it does not.
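
If the column names are not known in advance, the same idea can be applied column by column. This is only a sketch: infer_from_strings is a hypothetical helper, not a pandas function, and it assumes booleans are spelled 'true'/'false' and that anything pd.to_numeric cannot parse should stay as text.

```python
import numpy as np
import pandas as pd


def infer_from_strings(frame):
    """Hypothetical helper: parse string columns into boolean, numeric, or string."""
    bool_map = {"true": True, "false": False}
    out = frame.copy()
    for col in out.columns:
        values = out[col].dropna()
        lowered = values.astype(str).str.lower()
        if len(values) and lowered.isin(list(bool_map)).all():
            # Only 'true'/'false' plus missing values: use nullable boolean.
            out[col] = out[col].astype(str).str.lower().map(bool_map).astype("boolean")
            continue
        try:
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            # Not numeric: leave the column as text.
            pass
    # Switch to nullable dtypes (Int64, string, ...) where possible.
    return out.convert_dtypes()


# Small stand-in for the sample dataframe (assumed values).
df = pd.DataFrame({
    "A": [np.nan, "true", np.nan, "true"],
    "C": ["CustomersData", "TestData", "CustomersData", "CustomersData"],
    "D": ["4014", "500", "4014", "500"],
})

result = infer_from_strings(df)
print(result.dtypes)
```

Calling convert_dtypes() only at the end is a deliberate choice here: once the strings have been parsed, it has real numbers and booleans to work with.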
