Converting Pandas column data with np.where() not working as it should

Time:07-13

I am using np.where() in combination with a Pandas DataFrame to keep the column as-is if it contains the phrase "no restriction", and make it a float if not. Here is my code:

main_df[col] = np.where(main_df[col].str.contains('no restriction', case=False, na=False, regex=False),
                        main_df[col],
                        main_df[col].apply(lambda x: float(x)))

Here is the error I am getting on a cell that contains the string "No restriction":

    140     main_df[col] = np.where(main_df[col].str.contains('no restriction', case=False, na=False, regex=False),    
    141                             main_df[col],
--> 142                             main_df[col].apply(lambda x: float(x)))


ValueError: could not convert string to float: 'No restriction'

It looks like the Series.str.contains() isn't detecting a cell that contains the string 'No restriction'. What am I doing wrong?

CodePudding user response:

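The problem is not str.contains(). np.where() is an ordinary function call, so all three of its arguments are computed in full before it runs. That means main_df[col].apply(lambda x: float(x)) still tries to convert the whole series, including the 'No restriction' cells, and raises that error before any selection happens. You can use pd.to_numeric with the errors='coerce' option instead, then fill the resulting NaN values back in from the original column: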

main_df[col] = pd.to_numeric(main_df[col], errors='coerce').fillna(main_df[col])
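To make the difference concrete, here is a minimal sketch on made-up data. The Series s is a hypothetical stand-in for main_df[col], and the values in it are assumptions, not the asker's data. It shows that the mask is correct, that the failure comes from the eagerly evaluated third argument, and that the pd.to_numeric version returns the expected mix of floats and strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for main_df[col].
s = pd.Series(["1.5", "No restriction", "3", "no restriction"], dtype=object)

# The mask itself is fine: it flags both spellings of the phrase.
mask = s.str.contains("no restriction", case=False, na=False, regex=False)

# Python evaluates s.apply(float) in full before np.where() is called,
# so the conversion hits the text cells regardless of the mask.
try:
    np.where(mask, s, s.apply(float))
    eager_error = None
except ValueError as exc:
    eager_error = str(exc)

# Unparseable cells become NaN, then fillna() restores the original text there.
converted = pd.to_numeric(s, errors="coerce").fillna(s)
```

Note that converted has object dtype, because it holds both float and str values.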

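The question is why you want this, though. You should not be mixing float with str in a column: the column ends up with object dtype, so numeric operations on it are slower and can fail on the string cells.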
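If the goal is only to remember which rows said "no restriction", one possible alternative is to keep the numbers and the flag in separate columns. This is a sketch under assumptions: the DataFrame df and the column names raw, no_restriction and limit are hypothetical and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data; all column names here are illustrative.
df = pd.DataFrame({"raw": pd.Series(["1.5", "No restriction", "3"], dtype=object)})

# Boolean flag recording which rows held the sentinel text.
df["no_restriction"] = df["raw"].str.contains(
    "no restriction", case=False, na=False, regex=False
)

# Numeric column stays float64; the sentinel rows become NaN.
df["limit"] = pd.to_numeric(df["raw"], errors="coerce")

# Numeric operations now work directly and skip NaN by default.
total = df["limit"].sum()
```

With this layout the numeric column keeps a float dtype, and the flag column can be used to filter the unrestricted rows when needed.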
