I want to filter a dataframe so that a row is kept only if the value in a specific column is in a given list. Every other row should be dropped.
Do you have any suggestions? Thanks in advance.
As an example, if the value in column 'C' is not in the list l, I want to drop the entire row:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(20, 4)), columns=list('ABCD'))
l = [4, 6, 23, 45, 79]
CodePudding user response:
df = df[df.apply(lambda x: any(x.isin(l)), axis=1)]
This keeps a row if at least one of its columns has a value in l. If only column C should be checked:
df[df.apply(lambda x: x["C"] in l, axis=1)]
Or, if every column must have a value in l:
df = df[df.apply(lambda x: all(x.isin(l)), axis=1)]
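The row-wise apply calls above work, but they run a Python function once per row. As a rough sketch of the same three filters written with vectorized isin masks (the fixed seed and the names any_match, c_match and all_match are assumptions added for illustration, not part of the question):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, only for reproducibility
df = pd.DataFrame(rng.integers(0, 100, size=(20, 4)), columns=list("ABCD"))
l = [4, 6, 23, 45, 79]

# Keep rows where at least one column holds a value from l.
any_match = df[df.isin(l).any(axis=1)]

# Keep rows where column C holds a value from l (the case asked about).
c_match = df[df["C"].isin(l)]

# Keep rows where every column holds a value from l.
all_match = df[df.isin(l).all(axis=1)]
```

For the question as asked, the middle line is the one that matters: df["C"].isin(l) builds a boolean Series, and indexing df with it keeps only the rows where that Series is True.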
CodePudding user response:
You can try it like this:
df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["a", "b"])
df[~df.a.isin([3])]
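This keeps only the rows whose a value is not in the list passed to isin, i.e. the rows where a is not 3. Note that this is the opposite of what the question asks for. To keep only the rows whose value is in the list, index with the isin mask directly: df[df.a.isin([3])].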
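If an explicit drop reads better than boolean indexing, the same result can probably be reached with DataFrame.drop. This is a minimal sketch on the small frame from this answer. The names keep, to_drop and result are hypothetical, and it assumes the index labels are unique, since drop removes every row that carries a given label:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["a", "b"])
keep = [3]  # hypothetical list of allowed values

# Index labels of the rows whose a value is NOT in keep.
to_drop = df.index[~df["a"].isin(keep)]

# drop() returns a new frame without those rows; df itself is unchanged.
result = df.drop(to_drop)
```

Here result contains only the row where a is 3, which matches df[df["a"].isin(keep)].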