Pandas : How to drop a row where column values match with a specific value (all value are list of va

Time:10-27

I have a DataFrame column in which every value is a list (one list per cell, with one or more items).

I want to delete the rows where a specific string is found in that list. For example, a cell can hold a 5-item list; if any one of the items matches the string, the row has to be dropped.

for row in df:
    for count, item in enumerate(df["prescript"]):
        for element in item:
            if "complementary" in element:
                df.drop(row)

df["prescript"] is the column I want to iterate over.
"complementary" is the word to look for: if it is found in the column value, the row has to be dropped.

How can I improve the code above to make it work?

Thanks all

CodePudding user response:

First build a boolean mask of the rows that contain the word, using Series.apply:

word = "complementary"
word_is_in = df["prescript"].apply(lambda list_item: word in list_item)

Then use boolean indexing to keep only the rows that don't contain the word, by inverting the boolean Series word_is_in:

df = df[~word_is_in]
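
As a self-contained sketch of the two steps above (the sample data and the helper name drop_rows_containing are made up for illustration): note that `word in list_item` tests for a list element exactly equal to the word. If a substring match inside each element is wanted, as in the question's loop, `any(word in element for element in list_item)` could be used instead.

```python
import pandas as pd

# Hypothetical sample data; only the "prescript" column name comes from the question.
df = pd.DataFrame({
    "drug": [1, 2, 3, 4],
    "prescript": [["a", "s"], ["e", "complementary"], ["e", "a"], ["complementary care"]],
})

def drop_rows_containing(frame, column, word, substring=False):
    """Return the rows of `frame` whose list in `column` does not contain `word`."""
    if substring:
        # Match when the word appears inside any element of the list.
        mask = frame[column].apply(lambda items: any(word in item for item in items))
    else:
        # Match only when some element is exactly equal to the word.
        mask = frame[column].apply(lambda items: word in items)
    return frame[~mask]

exact = drop_rows_containing(df, "prescript", "complementary")
partial = drop_rows_containing(df, "prescript", "complementary", substring=True)
print(exact.index.tolist())    # [0, 2, 3]
print(partial.index.tolist())  # [0, 2]
```

The helper returns a new DataFrame and leaves the original untouched, which avoids calling df.drop inside a loop.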

CodePudding user response:

Impractical solution that may trigger some new learning:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    columns="   index    drug                 prescript ".split(),
    data= [
            [       0,      1,     ['a', 's', 'd', 'f'], ],
            [       1,      2,     ['e', 'a', 'e', 'f'], ],
            [       2,      3,               ['e', 'a'], ],
            [       3,      4,   ['a', 'complementary'], ],]).set_index("index", drop=True)

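# Explode the lists to one element per row, replace the target word with NaN,
# then keep only the original rows in which no element was replaced.
df.loc[
    df['prescript'].explode().replace({'complementary': np.nan}).groupby(level=0).agg(lambda x: x.notna().all())
]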
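
A shorter variant of the same explode-and-regroup idea, offered as a sketch with made-up sample data: compare each exploded element with the word, then use groupby(level=0).any() to flag the original rows that had at least one match.

```python
import pandas as pd

# Hypothetical sample data mirroring the frame above.
df = pd.DataFrame({
    "drug": [1, 2, 3, 4],
    "prescript": [["a", "s", "d", "f"], ["e", "a", "e", "f"], ["e", "a"], ["a", "complementary"]],
})

# One row per list element; the original row label is repeated in the index.
exploded = df["prescript"].explode()

# True for each original row that has at least one element equal to the word.
has_word = exploded.eq("complementary").groupby(level=0).any()

# Keep the rows that were not flagged.
result = df[~has_word]
print(result.index.tolist())  # [0, 1, 2]
```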