Home > Blockchain >  Pandas: Populate new column with list from matches on column value
Pandas: Populate new column with list from matches on column value

Time:12-23

I need to create a new column that establishes relationships between entities in a delimited list ( any delimiter works except a comma ).

Dataframe:

df1 = pd.DataFrame(np.array([[1000, 'Jerry', 'BR1','BR1'], 
                             [1001, 'Sal', 'BR2', 'BR1'], 
                             [1002, 'Buck', 'BR3', 'BR2'],
                             [1003, 'Perry','BR4','BR1']]),
                   columns=['ID', 'Name', 'Branch', 'Member of'])

The final result should be:

ID    Name     Branch  Member of  Members
====  ====     ======  =========  =======
1000  Jerry    BR1     BR1        Jerry, Sal, Perry
1001  Sal      BR2     BR1        Buck
1002  Buck     BR3     BR2        NaN
1003  Perry    BR4     BR1        NaN

I need to create the "members" column by finding all matches in "Member of", but returning the "Name", then populate a list that ends up in "members".

Would np.where be a good place to start?

  • np.where(df['Branch'] == df['Member of'], ??, np.nan)

CodePudding user response:

Use groupby to generate the list of members, then merge :

s = df1.groupby('Member of')['Name'].apply(list).rename('Members')

df2 = df1.merge(s, left_on='Branch', right_index=True, how='left')

Output:

     ID   Name Branch Member of              Members
0  1000  Jerry    BR1       BR1  [Jerry, Sal, Perry]
1  1001    Sal    BR2       BR1               [Buck]
2  1002   Buck    BR3       BR2                  NaN
3  1003  Perry    BR4       BR1                  NaN

NB. If you want a string and not a list, use ', '.join in place of list

CodePudding user response:

Try this:

df1['Members'] = df1['Branch'].apply(lambda b: ', '.join(df1.loc[df1['Member of'] == b, 'Name'])).replace('', np.nan)

Output:

>>> df1
     ID   Name Branch Member of            Members
0  1000  Jerry    BR1       BR1  Jerry, Sal, Perry
1  1001    Sal    BR2       BR1               Buck
2  1002   Buck    BR3       BR2                NaN
3  1003  Perry    BR4       BR1                NaN
  • Related