I have a dataframe that contains over 200 columns. I need to create two lists that group the columns by type. Rather than creating these lists manually, I tried several approaches, none of which worked:
numeric_transformer = df.dtypes == 'int64'
or:
numeric_transformer = df.apply(lambda x: x if df.dtypes == 'int64' or 'float64' else
None, axis=1)
or:
categorical_transformer = df.apply(lambda x: x if df[df.dtypes == 'object'] else None,
axis=1)
Can anyone help me solve this problem?
CodePudding user response:
I believe you can use df.select_dtypes for this:
numeric_transformer = df.select_dtypes(exclude='object').columns  # every column that is not object dtype
categorical_transformer = df.select_dtypes(include='object').columns  # object (string) columns only
Another way to do this is with a list comprehension:
numeric_transformer = [column for column in df.columns if df.dtypes[column] in ('int64', 'float64')]
categorical_transformer = [column for column in df.columns if df.dtypes[column] == 'object']
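One thing to keep in mind: exclude='object' keeps everything that is not an object column, so bool or datetime columns would land in the numeric group too. If you only want integer and float columns, select_dtypes also accepts include='number'. Here is a minimal sketch on a made-up frame (the column names and values are hypothetical, not from your data); .tolist() turns the resulting Index into a plain Python list:

```python
import pandas as pd

# Hypothetical sample data standing in for the 200-column dataframe.
df = pd.DataFrame({
    "age": [25, 32, 47],                                   # int64
    "income": [50000.0, 64000.5, 58000.0],                 # float64
    "city": pd.Series(["Paris", "Lyon", "Nice"], dtype="object"),
    "segment": pd.Series(["a", "b", "a"], dtype="object"),
    "signup": pd.to_datetime(["2021-01-01", "2021-06-15", "2022-03-09"]),
})

# 'number' matches every integer and float dtype, but not bool or datetime.
numeric_columns = df.select_dtypes(include="number").columns.tolist()

# Object columns are usually the string/categorical ones.
categorical_columns = df.select_dtypes(include="object").columns.tolist()

print(numeric_columns)
print(categorical_columns)
```

In this sketch the datetime column ends up in neither list, so you can decide separately how to handle it.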