I am looking for a way to drop every row whose list in the ID column contains any value from a given list:
In:
df = pd.DataFrame({
"ID": [
[12, 1383],
[2898, 1871, 223],
[ 2855, 519, 12],
[55, 519],
[1230, 89564, 1247]],
"number":[1,2,3,4,5]
})
lst = [12, 55]
Out:
df = pd.DataFrame({
"ID": [
[2898, 1871, 223],
[1230, 89564, 1247]],
"number":[2,5]
})
I have come up with this solution:
df = [k for k in df['ID'] if not any(j in lst for j in k)]
This only works with this simplified data and fails in other cases, so I am looking for an alternative. Thank you.
CodePudding user response:
Use boolean indexing with set.isdisjoint:
df = df[df['ID'].map(set(lst).isdisjoint)]
# list comprehension alternative
# df = df[[set(lst).isdisjoint(x) for x in df['ID']]]
print(df)
                    ID  number
1    [2898, 1871, 223]       2
4  [1230, 89564, 1247]       5
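For reference, here is a self-contained sketch of the set.isdisjoint approach using the data from the question. The names blocked, keep and result are only illustrative. set.isdisjoint accepts any iterable, so each list in ID can be passed to it directly, and the set is built once rather than once per row.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [
        [12, 1383],
        [2898, 1871, 223],
        [2855, 519, 12],
        [55, 519],
        [1230, 89564, 1247],
    ],
    "number": [1, 2, 3, 4, 5],
})
lst = [12, 55]

# Build the lookup set once (illustrative name).
blocked = set(lst)

# True for rows whose list shares no element with `blocked`.
keep = df["ID"].map(blocked.isdisjoint)

# Boolean indexing keeps only the rows marked True.
result = df[keep]
print(result)
```

Because the result is still a DataFrame, the number column and the original index labels are preserved, unlike the list comprehension in the question, which produces a plain Python list.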
CodePudding user response:
Another possible solution:
mask = df['ID'].map(lambda x: any(y in lst for y in x))
df = df.drop(df[mask].index)
Output:
                    ID  number
1    [2898, 1871, 223]       2
4  [1230, 89564, 1247]       5
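The question does not say which "other cases" fail, so the following is only a guess at one of them: cells that are not lists at all (for example None or NaN). This is a minimal sketch under that assumption. drop_rows_containing and hits are hypothetical helper names, not pandas functions. It also selects with ~mask instead of drop, since dropping by index label would remove every row sharing a label if the index is not unique.

```python
import pandas as pd

def drop_rows_containing(df, column, values):
    """Return df without the rows whose list in `column` contains any of `values`."""
    blocked = set(values)

    def hits(cell):
        # Assumption: a cell that is not a list, tuple, or set never matches.
        if not isinstance(cell, (list, tuple, set)):
            return False
        return not blocked.isdisjoint(cell)

    mask = df[column].map(hits).astype(bool)
    # Invert the mask to keep the non-matching rows.
    return df[~mask]

# Example data with a missing cell (not from the question).
df = pd.DataFrame({
    "ID": [[12, 1383], [2898, 1871, 223], None, [55, 519]],
    "number": [1, 2, 3, 4],
})
result = drop_rows_containing(df, "ID", [12, 55])
print(result)
```

Whether a missing cell should be kept or dropped depends on the data. Returning True instead of False in the isinstance branch would drop those rows as well.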