Filtering list based on array columns to create a [dict(str:list)] result


My table (df) looks like this:

category    product_in_cat
cat1        [A, B, C]
cat2        [E, F, G]

"category" is str, and product_in_cat is list type. I have a list:product=[A,B,G] I want to get a final [dict(str:list)] looks like:

[{cat1:[A,B]},{cat2:[G]}]

I think I can use the code below:

list1 = []
for index, row in df.iterrows():
    list1.append({row['category']: row['product_in_cat'] in product})
        

I know the part row['product_in_cat'] in product is not correct, but I am not sure how to filter the list column based on the given "product" list. Please help, and thank you in advance!

CodePudding user response:

You can use np.intersect1d to find the common elements of two lists:

import numpy as np

# For each row, keep only the items that also appear in `product`
df_ = df['product_in_cat'].apply(lambda x: np.intersect1d(x, product).tolist())
# Pair each category with its filtered list, one single-key dict per row
l = [{k: v} for k, v in zip(df['category'], df_)]
print(l)

[{'cat1': ['A', 'B']}, {'cat2': ['G']}]
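Note that np.intersect1d returns the common values sorted and de-duplicated, which happens to coincide with the original order in this example. If you want to keep each row's original item order instead, a minimal plain-Python sketch (assuming df and product as defined in the question; filter_row is just an illustrative helper name):

# Hypothetical helper: keeps items in the order they appear in the row,
# rather than the sorted order np.intersect1d returns
def filter_row(items, allowed):
    allowed = set(allowed)  # set for O(1) membership tests
    return [item for item in items if item in allowed]

result = [
    {row['category']: filter_row(row['product_in_cat'], product)}
    for _, row in df.iterrows()
]
print(result)  # [{'cat1': ['A', 'B']}, {'cat2': ['G']}]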

CodePudding user response:

You can convert each list in the column to a set and use intersection with the external product list:

import pandas as pd

lst = ['A', 'B', 'G']
data = {'category': ['cat 1', 'cat 2'],
        'product_in_cat': [['A', 'B', 'C'], ['E', 'F', 'G']]}
df = pd.DataFrame(data)

# Intersect each row's list with the external product list
dict(zip(df['category'], df['product_in_cat'].apply(lambda x: set(x).intersection(lst))))

#output
{'cat 1': {'A', 'B'}, 'cat 2': {'G'}}
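The values here are sets rather than lists, and the result is a single dict rather than the list of single-key dicts asked for in the question. If you need exactly that shape, a small follow-up sketch (assuming the df and lst defined above; sorted is used only to make the list order deterministic):

# Convert each intersection to a list and wrap each pair in its own dict
out = [
    {cat: sorted(set(products).intersection(lst))}
    for cat, products in zip(df['category'], df['product_in_cat'])
]
print(out)  # [{'cat 1': ['A', 'B']}, {'cat 2': ['G']}]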