Compare each element in list of lists with a column in a dataframe python

Time:05-07

I want to compare each element in a list of lists with a dataframe column. For example,

import pandas as pd

groups_rids = [['AX1', 'AX2'], ['AX6', 'AX5', 'AX17']]
df = pd.DataFrame({'rid': ['AX1', 'AX2', 'AX6', 'AX5', 'AX17'],
                   'pid': ['P2', 'P0', 'P3', 'P9', 'P13'],
                   })

Here groups_rids is the list of lists. Each of its elements has to be compared with the rid column in df.
Dataset:

| rid | pid |
|:---- |:------:|
| AX1 | P2 |
| AX2 | P0 |
| AX6 | P3 |
| AX5 | P9 |
| AX17 | P13 |

My result should be:

| groups_rids | pid |
|:---- |:------:|
| [AX1,AX2] | [P2,P0] |
| [AX6,AX5,AX17] | [P3,P9,P13] |

For each rid in each list of groups_rids, I want to look it up in df and, if it is present, append the corresponding pid. The dataset is large, so three nested for loops take forever to produce the result. Is there a way to get the desired result without three nested for loops?

CodePudding user response:

Build a dict:

d = df.set_index('rid').to_dict()['pid']
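To see what this lookup contains, here is a minimal sketch on the sample data from the question, assuming the rid and pid values are plain strings and that rid values are unique. Selecting the pid column before calling to_dict gives the same mapping as the line above.

```python
import pandas as pd

# Sample data from the question (values assumed to be strings)
df = pd.DataFrame({'rid': ['AX1', 'AX2', 'AX6', 'AX5', 'AX17'],
                   'pid': ['P2', 'P0', 'P3', 'P9', 'P13']})

# Map each rid to its pid; picking the column first yields a flat dict
d = df.set_index('rid')['pid'].to_dict()
print(d)
# {'AX1': 'P2', 'AX2': 'P0', 'AX6': 'P3', 'AX5': 'P9', 'AX17': 'P13'}
```

Each later lookup d[rid] is then a constant-time dict access, which is what removes the innermost loop over df.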

And use it to build the DataFrame:

pd.DataFrame(((x, [d[el] for el in x]) for x in groups_rids), columns=['groups_rid', 'pid'])
         groups_rid            pid
0        [AX1, AX2]       [P2, P0]
1  [AX6, AX5, AX17]  [P3, P9, P13]
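Note that d[el] raises a KeyError when a rid from groups_rids is not in df, while the question asks to append a pid only "if present". A possible variant that skips missing rids is sketched below. The helper name pids_for and the extra group containing 'AX99' are made up for illustration and are not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({'rid': ['AX1', 'AX2', 'AX6', 'AX5', 'AX17'],
                   'pid': ['P2', 'P0', 'P3', 'P9', 'P13']})

# The third group is hypothetical: 'AX99' does not occur in df
groups_rids = [['AX1', 'AX2'], ['AX6', 'AX5', 'AX17'], ['AX2', 'AX99']]

# rid -> pid lookup (assumes rid values are unique)
d = df.set_index('rid')['pid'].to_dict()

def pids_for(group, lookup):
    """Return the pids for the rids in group that exist in lookup, in order."""
    return [lookup[rid] for rid in group if rid in lookup]

result = pd.DataFrame(
    [(group, pids_for(group, d)) for group in groups_rids],
    columns=['groups_rid', 'pid'],
)
print(result)
```

With this version the third group yields ['P0'] instead of raising an error, because 'AX99' is silently dropped.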