Get only the users that contain a certain list column

I have the following dataframe:

import pandas as pd
df = pd.DataFrame({'Id':['1','2','3'],'List_Origin':[['A','B'],['B','C'],['A','B']]})

How can I get only the Ids whose List_Origin is exactly a certain list, for example ['A','B']? I would appreciate a solution that avoids loops.

Desired end result:

end_df = pd.DataFrame({'Id':['1','3'],'List_Origin':[['A','B'],['A','B']]})

CodePudding user response：

You can use apply to compare each list against the target (in either of its two forms):

>>> df[df['List_Origin'].apply(lambda x: x==['A', 'B'] or x==['A,B'])]

  Id List_Origin
0  1       [A,B]
2  3      [A, B]

CodePudding user response：

Unfortunately, a column of Python lists has object dtype, so the comparison cannot be vectorized. You must use a loop (apply is also a loop under the hood).
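To make this concrete, here is a minimal sketch that reuses the question's data. It only illustrates that the column is stored as object dtype and that apply and a list comprehension produce the same mask, both evaluating the comparison once per row:

```python
import pandas as pd

df = pd.DataFrame({'Id': ['1', '2', '3'],
                   'List_Origin': [['A', 'B'], ['B', 'C'], ['A', 'B']]})

# A column of Python lists is stored with the generic object dtype.
print(df['List_Origin'].dtype)

# Both forms call the comparison once per row in Python.
mask_apply = df['List_Origin'].apply(lambda x: x == ['A', 'B'])
mask_loop = [x == ['A', 'B'] for x in df['List_Origin']]

# The two masks select the same rows.
print(mask_apply.tolist() == mask_loop)
```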

I am assuming first that you have ['A', 'B'] and not ['A,B'] in the first row:

end_df = df[[x==['A', 'B'] for x in df['List_Origin']]]

output:

  Id List_Origin
0  1      [A, B]
2  3      [A, B]
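The comparison above is order-sensitive: ['B', 'A'] would not match ['A', 'B']. If order should not matter (an assumption, since the question does not say), a possible variant compares sets instead. The data below is hypothetical, with row 3 reordered to show the difference:

```python
import pandas as pd

# Hypothetical data: the last row holds the same items in a different order.
df = pd.DataFrame({'Id': ['1', '2', '3'],
                   'List_Origin': [['A', 'B'], ['B', 'C'], ['B', 'A']]})

target = {'A', 'B'}

# set(x) == target ignores element order (and duplicates) within each list.
end_df = df[[set(x) == target for x in df['List_Origin']]]
print(end_df['Id'].tolist())
```

Note that this also treats ['A', 'A', 'B'] as a match, which may or may not be wanted.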

If you really do have a mix of ['A', 'B'] and ['A,B'], join each list before comparing:

end_df = df[[','.join(x)=='A,B' for x in df['List_Origin']]]

output:

  Id List_Origin
0  1       [A,B]
2  3      [A, B]
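If this filter is needed in several places, the same idea can be wrapped in a small helper. This is only a sketch: filter_by_list and normalize are hypothetical names, and the mixed data is the case assumed in this answer, not the dataframe shown in the question:

```python
import pandas as pd

def filter_by_list(frame, column, target):
    """Keep rows whose list in `column` equals `target`, after splitting
    comma-joined items such as 'A,B' into separate elements."""
    def normalize(items):
        # ['A,B'] -> ['A', 'B']; ['A', 'B'] stays unchanged.
        return [part.strip() for item in items for part in item.split(',')]

    mask = [normalize(x) == target for x in frame[column]]
    return frame[mask]

# Assumed mixed data: first row is ['A,B'], last row is ['A', 'B'].
df = pd.DataFrame({'Id': ['1', '2', '3'],
                   'List_Origin': [['A,B'], ['B', 'C'], ['A', 'B']]})

end_df = filter_by_list(df, 'List_Origin', ['A', 'B'])
print(end_df['Id'].tolist())
```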