Home > Blockchain >  Python pandas: left join by key and value in list of values:
Python pandas: left join by key and value in list of values:

Time:01-25

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
                   'value': [1, 2, 3, 4]})

df2 = pd.DataFrame({'key': ['B', 'D', 'D', 'F'],
                   'list_values': [[2, 4, 6], [4, 8], [1, 3, 5], [7, 9]]})

need to make left join by :

  1. df1['key'] = df2['key']
  2. df1['value'] in df2['list_values']

in output need to be:

   key  value  list_values
0   A     1       Nan
1   B     2     [2, 4, 6]
2   C     3      Nan
3   D     4     [1, 3, 5]

I can merge by key but how I can add a second conditional?

merged_df = df1.merge(df2, left_on='key', right_on='key', how='left')
............ ??

CodePudding user response:

You could add an extra column in df2 and explode it to act as a secondary key for the merge:

out = df1.merge(df2.assign(value=df2['list_values']).explode('value'),
                on=['key', 'value'], how='left')

Alternatively, merge and filter:

out = (df1.merge(df2, on='key', how='left')
          .loc[lambda d: [not isinstance(l, list) or v in l
                          for v, l in zip(d['value'], d['list_values'])]]
       )

Output:

  key value list_values
0   A     1         NaN
1   B     2   [2, 4, 6]
2   C     3         NaN
3   D     4      [4, 8]

CodePudding user response:

You can even just tried to insert column by checking list_values for each key's using lambda.

Code :

df1['list_values'] = df1.apply(lambda r: sum([l for l in list(df2.loc[df2['key']==r.key].list_values) if r.value in l] ,[]), axis=1)

Output:

key value   list_values
0   A   1   []
1   B   2   [2, 4, 6]
2   C   3   []
3   D   4   [4, 8]
  • Related