Make a new column based on only the first occurrence of each value in another column, filling the rest with 0

I have a data frame and a key, key = [1, 2, 3, 4]:

I want to create a new column called Response: if the Arm value is in the key, Response is 1, otherwise it is 0. The trick is that this should apply only to the first occurrence of each Arm value; any repetition of an Arm value should get a Response of 0. Like this:

Animal  Arm  Response
1       2    1  
1       4    1
1       3    1
1       3    0
1       1    1
1       1    0

At most 4 rows can have a Response of 1, one for each value in the key.

This is what I tried:

resp = []
for i in range(len(df)):
    for j in key:
        if df['Arm'][i] == j:
            resp.append(1)
            break
    else:
        resp.append(0)

df['Response'] = resp

But I don't know how to give a 1 only to the first occurrence of each key value and a 0 to any repetition of it.

Can someone help?

CodePudding user response：

Use Series.isin together with DataFrame.duplicated on both columns, so that duplicates are tested per Animal and Arm pair. In other words, repeated Arm values are detected separately within each Animal group.

I inferred this per-group logic from the group-by tag on the question.

df['Response'] = (df['Arm'].isin(key) & ~df.duplicated(['Animal','Arm'])).astype(int)
print (df)
   Animal  Arm  Response
0       1    2         1
1       1    4         1
2       1    3         1
3       1    3         0
4       1    1         1
5       1    1         0

With data for a second animal added, the difference becomes visible:

key = [1,2,3,4]
df['Response'] = (df['Arm'].isin(key) & ~df.duplicated(['Animal','Arm'])).astype(int)
print (df)
    Animal  Arm  Response
0        1    2         1
1        1    4         1
2        1    3         1
3        1    3         0
4        1    1         1
5        1    1         0
6        2    2         1
7        2    4         1
8        2    3         1
9        2    3         0
10       2    1         1
11       2    1         0
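
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the printed tables (the question does not show how df is built), and the intermediate names in_key and repeat are only illustrative:

```python
import pandas as pd

key = [1, 2, 3, 4]

# Assumed sample data, reconstructed from the printed output above:
# two animals that visit the same sequence of arms.
df = pd.DataFrame({
    "Animal": [1] * 6 + [2] * 6,
    "Arm": [2, 4, 3, 3, 1, 1] * 2,
})

in_key = df["Arm"].isin(key)                # Arm is one of the key values
repeat = df.duplicated(["Animal", "Arm"])   # True from the 2nd occurrence within an animal
df["Response"] = (in_key & ~repeat).astype(int)

print(df)
```

Because duplicated marks every occurrence after the first as True by default (keep='first'), negating it keeps only the first row for each Animal and Arm pair.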

CodePudding user response：

You can use isin combined with duplicated:

df['Response'] = (df['Arm'].isin(key) & ~df['Arm'].duplicated()).astype(int)

Or, with NumPy imported as np:

df['Response'] = np.where(df['Arm'].isin(key) & ~df['Arm'].duplicated(), 1, 0)

Output:

   Animal  Arm  Response
0       1    2         1
1       1    4         1
2       1    3         1
3       1    3         0
4       1    1         1
5       1    1         0
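
Note that checking duplicates on Arm alone treats a value already seen for one animal as a repeat for every later animal. Which behaviour is wanted depends on the data, so here is a small sketch comparing the two checks. The two-animal frame is made up for illustration, and the names global_first and per_animal are hypothetical:

```python
import pandas as pd

key = [1, 2, 3, 4]

# Made-up data for illustration: animal 2 revisits arms that animal 1 already used.
df = pd.DataFrame({
    "Animal": [1, 1, 1, 2, 2, 2],
    "Arm": [2, 3, 3, 2, 3, 1],
})

# Arm-only check: first occurrence across the whole column.
global_first = (df["Arm"].isin(key) & ~df["Arm"].duplicated()).astype(int)

# Per-animal check: first occurrence within each Animal.
per_animal = (df["Arm"].isin(key) & ~df.duplicated(["Animal", "Arm"])).astype(int)

print(global_first.tolist())  # [1, 1, 0, 0, 0, 1]
print(per_animal.tolist())    # [1, 1, 0, 1, 1, 1]
```

With a single animal, as in the question's sample, both versions give the same result.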

CodePudding user response：

resp = []
respDone = set()  # Arm values that have already been given a 1
for arm in df['Arm']:
    # 1 only for the first occurrence of an Arm value that is in key
    if arm in key and arm not in respDone:
        resp.append(1)
        respDone.add(arm)
    else:
        resp.append(0)

df['Response'] = resp
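
As a rough check, the loop can be wrapped in a small helper and compared with the isin/duplicated one-liner on the question's sample. This is only a sketch: the helper name first_occurrence_flags is hypothetical, and the DataFrame is reconstructed from the table in the question.

```python
import pandas as pd


def first_occurrence_flags(arms, key):
    """Return 1 for the first occurrence of each value that is in key, else 0."""
    allowed = set(key)
    seen = set()  # values that have already been given a 1
    flags = []
    for arm in arms:
        if arm in allowed and arm not in seen:
            flags.append(1)
            seen.add(arm)
        else:
            flags.append(0)
    return flags


key = [1, 2, 3, 4]
# Assumed sample data, taken from the table in the question.
df = pd.DataFrame({"Animal": [1] * 6, "Arm": [2, 4, 3, 3, 1, 1]})

df["Response"] = first_occurrence_flags(df["Arm"].tolist(), key)

# Vectorized equivalent, ignoring Animal just like the loop does.
vectorized = (df["Arm"].isin(key) & ~df["Arm"].duplicated()).astype(int)

print(df["Response"].tolist())  # [1, 1, 1, 0, 1, 0]
```

Like the loop, this ignores the Animal column, so with several animals the set of seen values would need to be reset for each one.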