How to remove all Pandas rows of lists if they contain specific values?

Time:12-09

I am looking for a way to drop all rows whose list contains any value from a given list:

In:

import pandas as pd

df = pd.DataFrame({
    "ID": [
        [12, 1383],
        [2898, 1871, 223],
        [2855, 519, 12],
        [55, 519],
        [1230, 89564, 1247]],
    "number": [1, 2, 3, 4, 5]
})
lst = [12, 55]

Out:

df = pd.DataFrame({
    "ID": [
    [2898, 1871, 223],
    [1230, 89564, 1247]],
    "number":[2,5]
})

I have come up with this solution:

df = [k for k in df['ID'] if not any(j in lst for j in k)]

which only works with this simplified data, but not in other cases, so I am looking for an alternative. Thank you.

CodePudding user response:

Use boolean indexing with set.isdisjoint, which returns True when a row's list shares no element with lst:

df = df[df['ID'].map(set(lst).isdisjoint)]

# list comprehension alternative
# df = df[[set(lst).isdisjoint(x) for x in df['ID']]]
print(df)
                    ID  number
1    [2898, 1871, 223]       2
4  [1230, 89564, 1247]       5
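
To make the mask in this answer visible, here is a minimal self-contained sketch using the data from the question. The names blocked, keep and result are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [
        [12, 1383],
        [2898, 1871, 223],
        [2855, 519, 12],
        [55, 519],
        [1230, 89564, 1247]],
    "number": [1, 2, 3, 4, 5]
})
lst = [12, 55]

# Build the set once so it is not recreated for every row.
blocked = set(lst)

# isdisjoint accepts any iterable, so each list in "ID" can be passed directly.
# The mask is True for rows that share no element with lst.
keep = df["ID"].map(blocked.isdisjoint)

# Boolean indexing keeps the DataFrame structure and the original index labels.
result = df[keep]
print(result)
```

Unlike the list comprehension in the question, which returns a plain Python list of the ID values, this keeps the other columns and the row index.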

CodePudding user response:

Another possible solution, which marks the rows that contain any value from lst and then drops them by index:

mask = df['ID'].map(lambda x: any(y in lst for y in x))
df = df.drop(df[mask].index)

Output:

                    ID  number
1    [2898, 1871, 223]       2
4  [1230, 89564, 1247]       5
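
A further variant, not taken from the answers above, avoids the Python-level lambda by flattening the lists first. This is only a sketch, assuming the index labels of df are unique; the names exploded, hit, mask and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [
        [12, 1383],
        [2898, 1871, 223],
        [2855, 519, 12],
        [55, 519],
        [1230, 89564, 1247]],
    "number": [1, 2, 3, 4, 5]
})
lst = [12, 55]

# One row per list element; the original index label is repeated for each element.
exploded = df["ID"].explode()

# True for every element that appears in lst.
hit = exploded.isin(lst)

# Collapse back to one value per original row: True if any element matched.
mask = hit.groupby(level=0).any()

# Keep the rows that had no match.
result = df[~mask]
print(result)
```

The mask from the answer above can be used the same way: df[~mask] gives the same rows as df.drop(df[mask].index) when the index labels are unique.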