df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
'value': [1, 2, 3, 4]})
df2 = pd.DataFrame({'key': ['B', 'D', 'D', 'F'],
'list_values': [[2, 4, 6], [4, 8], [1, 3, 5], [7, 9]]})
need to make left join by :
- df1['key'] = df2['key']
- df1['value'] in df2['list_values']
in output need to be:
key value list_values
0 A 1 Nan
1 B 2 [2, 4, 6]
2 C 3 Nan
3 D 4 [1, 3, 5]
I can merge by key but how I can add a second conditional?
merged_df = df1.merge(df2, left_on='key', right_on='key', how='left')
............ ??
CodePudding user response:
You could add an extra column in df2
and explode
it to act as a secondary key for the merge
:
out = df1.merge(df2.assign(value=df2['list_values']).explode('value'),
on=['key', 'value'], how='left')
Alternatively, merge
and filter:
out = (df1.merge(df2, on='key', how='left')
.loc[lambda d: [not isinstance(l, list) or v in l
for v, l in zip(d['value'], d['list_values'])]]
)
Output:
key value list_values
0 A 1 NaN
1 B 2 [2, 4, 6]
2 C 3 NaN
3 D 4 [4, 8]
CodePudding user response:
You can even just tried to insert column by checking list_values for each key's using lambda.
Code :
df1['list_values'] = df1.apply(lambda r: sum([l for l in list(df2.loc[df2['key']==r.key].list_values) if r.value in l] ,[]), axis=1)
Output:
key value list_values
0 A 1 []
1 B 2 [2, 4, 6]
2 C 3 []
3 D 4 [4, 8]