New column for each element in a list

Time:07-26

I have a dataset with many names. For each name in a given list, I want to create a new column that contains 1 where the row's name matches and 0 where it does not.

Original data:

enter image description here

Desired output:

enter image description here

I've tried the following:

names=['Tom','Sarah','Bob']

def function(x):
    for n in names:
        if (x['Name']==n):
            return 1
        else:
            return 0
        
for n in names:        
    df[n]=df.apply(function,axis=1)

This doesn't work because it returns the 'Tom' column for all names:

enter image description here

What am I doing wrong?

CodePudding user response:

You can use str.get_dummies and keep only the columns for the names you want:

out = df.join(df.Name.str.get_dummies()[names])
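A minimal, self-contained sketch of this approach. The sample values in df are an assumption, since the original data was only posted as an image. It also swaps the plain [names] selection for reindex, so that a name in the list that never occurs in the column gives an all-zero column rather than a KeyError.

```python
import pandas as pd

# Hypothetical sample data; the original frame was only shown as an image.
df = pd.DataFrame({"Name": ["Tom", "Sarah", "Alice", "Bob", "Tom"]})
names = ["Tom", "Sarah", "Bob"]

# str.get_dummies builds one 0/1 column per distinct value in Name.
# reindex keeps only the wanted names and fills any name that is
# absent from the data with 0 instead of raising a KeyError.
dummies = df["Name"].str.get_dummies().reindex(columns=names, fill_value=0)

# Attach the indicator columns to the original frame.
out = df.join(dummies)
print(out)
```

Rows whose name is not in the list (here 'Alice') simply get 0 in every new column.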

CodePudding user response:

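You don't need the for loop inside your function. The function returns during the first pass of that loop, so every row is only ever compared against 'Tom', which is why all three columns come out identical.

Instead, compare the column against each name directly: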

names = ['Tom','Sarah','Bob']

for n in names:
    df[n] = df['Name'].eq(n).astype(int)
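A small sketch that runs the question's function next to the vectorized comparison, so the difference is visible. The sample data is hypothetical, and the name broken is only used here for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
df = pd.DataFrame({"Name": ["Tom", "Sarah", "Alice", "Bob"]})
names = ["Tom", "Sarah", "Bob"]

def broken(row):
    # Same logic as in the question: both branches return,
    # so the loop never gets past names[0] ('Tom').
    for n in names:
        if row["Name"] == n:
            return 1
        else:
            return 0

# The result does not depend on which column is being filled.
broken_result = df.apply(broken, axis=1)
print(broken_result.tolist())

# Vectorized fix: one boolean comparison per name, cast to 0/1.
for n in names:
    df[n] = df["Name"].eq(n).astype(int)

print(df)
```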

Or with NumPy broadcasting, which compares the single-column (n, 1) array of names against the whole list at once:

df[names] = (df[['Name']].values == names).astype(int)
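To make the shapes in the broadcasting version explicit, here is a sketch on hypothetical sample data. The intermediate variables col, targets and mask are introduced only for illustration, and the list is converted to an object array explicitly so that the comparison is between two arrays of the same dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
df = pd.DataFrame({"Name": ["Tom", "Sarah", "Alice", "Bob"]})
names = ["Tom", "Sarah", "Bob"]

# Double brackets keep the column two-dimensional: shape (4, 1).
col = df[["Name"]].to_numpy()

# The target names as a 1-D array: shape (3,).
targets = np.array(names, dtype=object)

# (4, 1) == (3,) broadcasts to (4, 3): one row per record, one column per name.
mask = col == targets
print(mask.shape)

# Assign all indicator columns in one step.
df[names] = mask.astype(int)
print(df)
```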