I have the following dataframe and list:
df = [[[1,2,3],'a'],[[4,5],'b'],[[6,7,8],'c']]
list = [[1,2,3],[4,5]]
And I want to do a inner merge between them, so I can keep the items in common. This will be my result:
df = [[1,2,3],'a'],[[4,5],'b']]
I have been thinking in converting both to strings, but even if I convert my list to string, I haven't been able to merge both of them as the merge function requires the items to be series or dataframes (not strings). This could be a great help!!
Thanks
CodePudding user response:
If I understand you correctly, you want only keep rows from the dataframe where the values (lists) are both in the column and the list:
lst = [[1, 2, 3], [4, 5]]
print(df[df["col1"].isin(lst)])
Prints:
col1 col2
0 [1, 2, 3] a
1 [4, 5] b
DataFrame used:
col1 col2
0 [1, 2, 3] a
1 [4, 5] b
2 [6, 7, 8] c
CodePudding user response:
Thanks for your answer!
This is what worked for me:
- Convert my list to a series (my using DB):
match = pd.Series(','.join(map(str,match)))
- Convert the list of my master DB into a string:
df_temp2['match_s'].loc[m] = ','.join(map(str,df_temp2['match'].loc[m]))
- Applied an inner merge on both DB:
df_temp3 = df_temp2.merge(match.rename('match'), how='inner', left_on='match_s', right_on='match')
Hope it also works for somebody else :)