I'm trying to join a ranked recommendation dataframe with a dataframe containing true actions for each user (indicating whether they watched the recommended movie or not). I'd like to first join on id column and then join on recommendations/true_actions for that id. Here's an example:
rec_df: labels_df:
id | recommendation | rank | id | movie 1.| movie2.| movie 3|
---------------------------- --------------------------------
110| movie 1. | 1. | 110| 0. | 1 | 0
110| movie 2. | 2. | --------------------------------
110| movie 3 | 3. | 210| 1. | 0 | 1
---------------------------- --------------------------------
210| movie 1. | 2. |
210| movie 2. | 1. |
210| movie 3 | 3. |
----------------------------
the desired resulting df after join:
id | recommendation | rank | true_action
----------------------------------------
110| movie 1. | 1. | 0 |
110| movie 2. | 2. | 1 |
110| movie 3 | 3. | 0 |
----------------------------------------
210| movie 1. | 2. | 1 |
210| movie 2. | 1. | 0 |
210| movie 3 | 3. | 1 |
----------------------------
Is there an easy way to do this type of join in pandas? thanks a lot for your help in advance!
CodePudding user response:
Just flatten labels_df
before joining:
df1['true_action'] = df1.merge(df2.melt('id', var_name='recommendation'),
on=['id', 'recommendation'], how='left')['value']
print(df1)
# Output
id recommendation rank true_action
0 110 movie 1 1 0
1 110 movie 2 2 1
2 110 movie 3 3 0
3 210 movie 1 2 1
4 210 movie 2 1 0
5 210 movie 3 3 1
Note: I remove all trailing dots to be clearer