How to create a new column in pandas dataframe based on a condition?

Time:05-20

I have a data frame with the following columns:

import pandas as pd

d = {'find_no': [1, 2, 3], 'zip_code': [32351, 19207, 8723]}
df = pd.DataFrame(data=d)

When a zip_code value has 5 digits, I want to return True. When it does not have 5 digits, I want to return that row's find_no. The results should go in a new column added to the dataframe, with each value on the row it refers to.

CodePudding user response:

You could try np.where:

import numpy as np

df['result'] = np.where(df['zip_code'].astype(str).str.len() == 5, True, df['find_no'])

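The only downside of this approach is that NumPy converts your True values to 1, because the result array needs a single dtype and find_no is an integer column. That could be confusing, since a True result then looks the same as a find_no of 1. One way to keep the values distinguishable is: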

import numpy as np

df['result'] = np.where(df['zip_code'].astype(str).str.len() == 5, 'True', df['find_no'].astype(str))

The downside here is that casting to strings loses the original types: 'True' is no longer a boolean, and the find_no values are no longer integers. I guess it all depends on what you're hoping to accomplish.
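If you need both real booleans and real integers in the same column, one possible middle ground is to cast find_no to object before calling np.where. This is only a sketch built on the sample data from the question; the variable name is_five_digits is my own and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'find_no': [1, 2, 3], 'zip_code': [32351, 19207, 8723]})

# True where the zip code has exactly 5 digits when written out as a string
is_five_digits = df['zip_code'].astype(str).str.len() == 5

# Casting find_no to object first gives np.where an object array to fill,
# so True is stored as a boolean instead of being upcast to the integer 1
df['result'] = np.where(is_five_digits, True, df['find_no'].astype(object))

print(df['result'].tolist())
print(df['result'].dtype)
```

The trade-off is that the result column has object dtype, so it mixes types and gives up the fast vectorized operations you get on a plain integer or boolean column. If that matters, keeping a separate boolean column for the 5-digit check may be cleaner.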
