How can I create a dictionary showing which entries in column a are associated with which entries in

Time:10-22

I want to understand the relationship between the values in columns A and B. Both columns hold strings. For example:

import pandas as pd

df = pd.DataFrame({'First_Name': ['Agatha', 'Agatha', 'Hercule', 'Hercule'],
                   'Last Name': ['Christie', 'Raisin', 'Poirot', 'Holmes']})

I want some kind of data product that shows me:

Agatha: ['Christie', 'Raisin']
Hercule: ['Poirot', 'Holmes']

I would like to be able to do this without a loop.

CodePudding user response:

df.groupby('First_Name', as_index=False)['Last Name'].agg(list)

  First_Name           Last Name
0     Agatha  [Christie, Raisin]
1    Hercule    [Poirot, Holmes]

To drop duplicate (first name, last name) rows before grouping:

df.drop_duplicates().groupby('First_Name', as_index=False)['Last Name'].agg(list)
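
Since the question asks for a dictionary, and the calls above return a DataFrame, one possible way to get a plain dict is to group without as_index=False and call to_dict() on the resulting Series. This is a minimal sketch using the sample data from the question; the variable name mapping is only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'First_Name': ['Agatha', 'Agatha', 'Hercule', 'Hercule'],
                   'Last Name': ['Christie', 'Raisin', 'Poirot', 'Holmes']})

# Group by first name, collect the last names of each group into a list,
# then turn the resulting Series (indexed by First_Name) into a dict.
mapping = (
    df.drop_duplicates()           # optional: skip repeated name pairs
      .groupby('First_Name')['Last Name']
      .agg(list)
      .to_dict()
)

print(mapping)
# {'Agatha': ['Christie', 'Raisin'], 'Hercule': ['Poirot', 'Holmes']}
```

Leaving First_Name as the index (the default) is what lets to_dict() use the first names as keys.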