How to use np.where nested in data frame with pandas?

Time:05-26

I want to implement a logic to add a custom label based on the following criteria:

if df[(df['value1'] ==0) & (df['value2']==1)] then label1
if df[(df['value1'] ==0) & (df['value2']==0)] then label2
if df[(df['value1'] ==1) & (df['value2']==1)] then label3
if df[(df['value1'] ==1) & (df['value2']==0)] then label4

Expected output:

label_class | other columns
label1      |...
label1      |...
label3      |...
label2      |...

I tried with np.where but I am not sure how to do the nesting properly. Any advice is appreciated.

CodePudding user response:

Use numpy.select:

m1 = (df['value1'] ==0) & (df['value2']==1)
m2 = (df['value1'] ==0) & (df['value2']==0)
m3 = (df['value1'] ==1) & (df['value2']==1)
m4 = (df['value1'] ==1) & (df['value2']==0)
labels = ['label1', 'label2', 'label3', 'label4']

df['label_class'] = np.select([m1, m2, m3, m4], labels)
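For reference, here is a minimal self-contained sketch of the same np.select approach. The sample data and the 'unknown' fallback label are assumptions made for illustration, not part of the question. Passing a string `default` means rows that match no condition get a label rather than the integer 0, and, as far as I know, it also avoids a dtype error that recent NumPy versions can raise when string choices are mixed with the integer default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row matches none of the four conditions.
df = pd.DataFrame({'value1': [0, 0, 1, 1, 2], 'value2': [1, 0, 1, 0, 0]})

# One boolean mask per label, in the same order as the labels list.
conditions = [
    (df['value1'] == 0) & (df['value2'] == 1),
    (df['value1'] == 0) & (df['value2'] == 0),
    (df['value1'] == 1) & (df['value2'] == 1),
    (df['value1'] == 1) & (df['value2'] == 0),
]
labels = ['label1', 'label2', 'label3', 'label4']

# 'unknown' is an assumed fallback for rows where every mask is False.
df['label_class'] = np.select(conditions, labels, default='unknown')
print(df)
```

If several masks could be True for the same row, np.select takes the first matching one, so the order of the condition list matters.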

Another idea is to create a helper DataFrame with all combinations and their labels, then add it to the original DataFrame with a left join:

df1 = pd.DataFrame({'value1':[0,0,1,1], 'value2':[1,0,1,0], 'label_class':labels})

df = df.merge(df1, on=['value1','value2'], how='left')
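The left join can be tried end to end with the sketch below. The sample data and the name `lookup` are assumptions for illustration. The `validate='many_to_one'` argument is an optional safeguard that is not in the answer above: it makes the merge raise if the helper table accidentally contains a duplicated combination, which would otherwise duplicate rows in the result.

```python
import pandas as pd

# Hypothetical sample data; the last row has no entry in the lookup table.
df = pd.DataFrame({'value1': [0, 0, 1, 1, 2], 'value2': [1, 0, 1, 0, 0]})

# Helper table: one row per (value1, value2) combination.
lookup = pd.DataFrame({
    'value1': [0, 0, 1, 1],
    'value2': [1, 0, 1, 0],
    'label_class': ['label1', 'label2', 'label3', 'label4'],
})

# A left join keeps every row of df; combinations missing from lookup get NaN.
merged = df.merge(lookup, on=['value1', 'value2'], how='left',
                  validate='many_to_one')
print(merged)
```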

Another idea is to map a dictionary keyed by both columns:

d = {(0, 1): 'label1', (0, 0): 'label2', (1, 1): 'label3', (1, 0): 'label4'}

df['label_class'] = df.set_index(['value1','value2']).index.map(d)
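A runnable sketch of the dictionary mapping, using assumed sample data. Each row's (value1, value2) pair becomes a tuple in a temporary MultiIndex, and the dictionary is looked up with those tuples; a pair that is not a key in the dictionary should come back as a missing value.

```python
import pandas as pd

# Hypothetical sample data; (2, 0) is deliberately absent from the dictionary.
df = pd.DataFrame({'value1': [0, 0, 1, 1, 2], 'value2': [1, 0, 1, 0, 0]})

d = {(0, 1): 'label1', (0, 0): 'label2', (1, 1): 'label3', (1, 0): 'label4'}

# set_index builds a MultiIndex of (value1, value2) tuples without modifying df;
# the mapped result is assigned back by position.
df['label_class'] = df.set_index(['value1', 'value2']).index.map(d)
print(df)
```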

CodePudding user response:

The syntax of np.where is np.where(condition, value_if_true, value_if_false), where condition is a boolean array or Series.

In your case, pass each boolean mask directly as the condition (not df[mask], and without the if keyword) and nest the next call in the value_if_false position:

df['label_class'] = np.where((df['value1'] == 0) & (df['value2'] == 1), 'label1',
                    np.where((df['value1'] == 0) & (df['value2'] == 0), 'label2',
                    np.where((df['value1'] == 1) & (df['value2'] == 1), 'label3',
                    np.where((df['value1'] == 1) & (df['value2'] == 0), 'label4', None))))
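If both columns are known to contain only 0 and 1 (an assumption, since the question does not say so), the nesting can also be written as a two-level decision: branch on value1 first, then on value2 inside each branch. This is only a sketch on assumed sample data; any other value would silently fall into the "else" branches, so np.select is the safer choice when that assumption might not hold.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data restricted to the values 0 and 1.
df = pd.DataFrame({'value1': [0, 0, 1, 1], 'value2': [1, 0, 1, 0]})

v1_is_zero = df['value1'] == 0
v2_is_one = df['value2'] == 1

# Outer np.where branches on value1; each inner np.where branches on value2.
df['label_class'] = np.where(
    v1_is_zero,
    np.where(v2_is_one, 'label1', 'label2'),  # value1 == 0
    np.where(v2_is_one, 'label3', 'label4'),  # value1 == 1 (assumed)
)
print(df)
```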