Python Pandas merge list of keys on separate dataframe-CodePudding

Im new to pandas and having trouble trying to merge data on specific 'keys'. I have a dataframe that contains rows, each row contains a column that has a list of keys. Im trying to merge each key if it matches to the column header of a different data frame.

df 1

   index | team_id | players
   0     |  1234   | [333, 444, 555]
   1     | 5678    | [890, 098, 766]

df 2

      | 333   |  766  | 
height| 7"    | 6"    |
weight| 300lbs| 250lbs|

desired result

  index | team_id | players
  0     | 1234    | [(height: 7", weight: 300lbs), 444, 555 ]
  1     | 5678    | [890, 098, (height: 6", weight: 250lbs)]

CodePudding user response：

We can convert the DataFrame df2 to a dictionary using pandas.DataFrame.to_dict. And then using pandas.Series.map go through the list to replace the id of player with the given information. I assume that players contain list of strings, as one of the value I see in the list is 098.

Creating the data

df1 = pd.DataFrame({
    'index' : [0, 1],
    'team_id' : [1234, 5678],
    'players' : [['333', '444', '555'], ['890', '098', '766']]
    })
    

df2 = pd.DataFrame({
    'index' : ['height', 'weight'],
    '333' : ['7"', '6"'],
    '766' : ['300lbs', '250lbs']
    }).set_index('index')

Pre-processing

Converting dataframe to dictionary

df2_dict = df2.to_dict()

Generating Output

Using map to go through the players column and replace the player id with the corresponding value in df2

df1['players'] = df1['players'].map(lambda x : [df2_dict[val] if(df2_dict.get(val)) else val  for val in x])

Output :

This gives us the expected output

   index  team_id                                               players
0      0     1234          [{'height': '7"', 'weight': '6"'}, 444, 555]
1      1     5678  [890, 098, {'height': '300lbs', 'weight': '250lbs'}]