Im new to pandas and having trouble trying to merge data on specific 'keys'. I have a dataframe that contains rows, each row contains a column that has a list of keys. Im trying to merge each key if it matches to the column header of a different data frame.
df 1
index | team_id | players
0 | 1234 | [333, 444, 555]
1 | 5678 | [890, 098, 766]
df 2
| 333 | 766 |
height| 7" | 6" |
weight| 300lbs| 250lbs|
desired result
index | team_id | players
0 | 1234 | [(height: 7", weight: 300lbs), 444, 555 ]
1 | 5678 | [890, 098, (height: 6", weight: 250lbs)]
CodePudding user response:
We can convert the DataFrame df2
to a dictionary using pandas.DataFrame.to_dict
. And then using pandas.Series.map
go through the list to replace the id
of player with the given information. I assume that players contain list of strings, as one of the value I see in the list is 098
.
Creating the data
df1 = pd.DataFrame({
'index' : [0, 1],
'team_id' : [1234, 5678],
'players' : [['333', '444', '555'], ['890', '098', '766']]
})
df2 = pd.DataFrame({
'index' : ['height', 'weight'],
'333' : ['7"', '6"'],
'766' : ['300lbs', '250lbs']
}).set_index('index')
Pre-processing
Converting dataframe to dictionary
df2_dict = df2.to_dict()
Generating Output
Using map to go through the players column and replace the player id with the corresponding value in df2
df1['players'] = df1['players'].map(lambda x : [df2_dict[val] if(df2_dict.get(val)) else val for val in x])
Output :
This gives us the expected output
index team_id players
0 0 1234 [{'height': '7"', 'weight': '6"'}, 444, 555]
1 1 5678 [890, 098, {'height': '300lbs', 'weight': '250lbs'}]