My table (df) looks like this:
category | product_in_cat |
---|---|
cat1 | [A,B,C] |
cat2 | [E,F,G] |
"category" is a str column and "product_in_cat" is a list column. I also have a list, product = [A, B, G], and I want to end up with a list of dicts (str: list) that looks like:
[{cat1:[A,B]},{cat2:[G]}]
I think I can use the code below:
list1 = []
for index, row in df.iterrows():
    list1.append({row['category']: row['product_in_cat'] in product})
I know this part is not correct: row['product_in_cat'] in product
I am not sure how to filter the list column based on the given "product" list. Please help, and thank you in advance!
CodePudding user response:
You can use np.intersect1d to find the elements common to two lists:
import numpy as np
df_ = df['product_in_cat'].apply(lambda x: np.intersect1d(x, product).tolist())
l = [{k: v} for k, v in zip(df['category'], df_)]
print(l)
[{'cat1': ['A', 'B']}, {'cat2': ['G']}]
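Note that np.intersect1d returns sorted, unique values, so the original order of each list and any duplicates are not kept. If that matters, a plain list comprehension with a set lookup is one possible alternative. The sketch below is only an illustration: the helper name filter_by_product is hypothetical, and plain lists stand in for df['category'] and df['product_in_cat'] (the same function should accept those columns directly).

```python
def filter_by_product(categories, product_lists, wanted):
    """Return [{category: [items also in wanted]}, ...], keeping each list's order."""
    wanted_set = set(wanted)  # set gives fast membership tests
    return [
        {cat: [p for p in items if p in wanted_set]}
        for cat, items in zip(categories, product_lists)
    ]

# Hypothetical stand-ins for df['category'] and df['product_in_cat'].
categories = ['cat1', 'cat2']
product_lists = [['A', 'B', 'C'], ['E', 'F', 'G']]
product = ['A', 'B', 'G']

result = filter_by_product(categories, product_lists, product)
print(result)  # [{'cat1': ['A', 'B']}, {'cat2': ['G']}]
```

With a DataFrame, the call would be filter_by_product(df['category'], df['product_in_cat'], product).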
CodePudding user response:
You can convert each list in the column to a set and use intersection with the external product list:
import pandas as pd
lst = ['A','B','G']
data = {'category': ['cat 1', 'cat 2'],
        'product_in_cat': [['A', 'B', 'C'], ['E', 'F', 'G']]}
df = pd.DataFrame(data)
result = dict(zip(df['category'], df['product_in_cat'].apply(lambda x: set(x).intersection(lst))))
print(result)
#output
{'cat 1': {'A', 'B'}, 'cat 2': {'G'}}
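This gives a single dict of sets rather than the list of one-key dicts with list values that the question asked for. A minimal sketch of one way to get that shape from the same set intersection is shown here. The helper name intersect_to_records is hypothetical, sorting the result is an assumption (sets have no order), and plain lists again stand in for the two DataFrame columns.

```python
def intersect_to_records(categories, product_lists, wanted):
    """Return [{category: sorted list of items also in wanted}, ...]."""
    wanted_set = set(wanted)
    return [
        # sorted() turns the unordered set into a deterministic list
        {cat: sorted(wanted_set.intersection(items))}
        for cat, items in zip(categories, product_lists)
    ]

# Hypothetical stand-ins for df['category'] and df['product_in_cat'].
categories = ['cat 1', 'cat 2']
product_lists = [['A', 'B', 'C'], ['E', 'F', 'G']]
lst = ['A', 'B', 'G']

records = intersect_to_records(categories, product_lists, lst)
print(records)  # [{'cat 1': ['A', 'B']}, {'cat 2': ['G']}]
```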