How to drop row in pandas if column1 = certain value and column 2 = NaN?

Time:11-14

I'm trying to do the following: "# drop all rows where tag == train_loop and start is NaN".

Here's my current attempt (thanks Copilot):

# drop all rows where tag == train_loop and start is NaN
# apply filter function to each row
# return True if row should be dropped
def filter_fn(row):
    return row["tag"] == "train_loop" and pd.isna(row["start"])

old_len = len(df)
df = df[~df.apply(filter_fn, axis=1)]

It works well, but I'm wondering if there is a less verbose way.

CodePudding user response:

Using apply with axis=1 is a slow way to do this, since it loops over every row and calls your Python function once per row. Instead, build the condition from vectorized column operations, which work on whole columns at once and run optimized C code under the hood:

df = df[~((df["tag"] == "train_loop") & df["start"].isnull())]
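To see the mask at work, here is a minimal sketch on a made-up DataFrame. The sample values are an assumption for illustration; only the column names tag and start come from the question. It also checks that the vectorized mask flags the same rows as the row-wise apply version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "tag": ["train_loop", "train_loop", "eval", "eval"],
    "start": [1.0, np.nan, np.nan, 4.0],
})

# True for rows to drop: tag is "train_loop" AND start is missing.
# The parentheses around the comparison are required because & binds
# more tightly than ==.
drop_mask = (df["tag"] == "train_loop") & df["start"].isna()

# Row-wise version from the question, built here only for comparison.
apply_mask = df.apply(
    lambda row: row["tag"] == "train_loop" and pd.isna(row["start"]),
    axis=1,
)

# Keep every row that is not flagged by the mask.
filtered = df[~drop_mask]
print(filtered)
```

Only the second row (train_loop with a missing start) is removed; the eval row with a missing start is kept because both conditions must hold.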

CodePudding user response:

You can do the same with the method forms eq and isna:

df = df.loc[~(df['tag'].eq('train_loop') & df['start'].isna())]
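If the leading ~ is hard to read, the same filter can be written as a "keep" condition instead (by De Morgan's law, not (A and B) equals (not A) or (not B)). This is a sketch on made-up data, not something from the answers above; the sample values are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "tag": ["train_loop", "train_loop", "eval", "eval"],
    "start": [1.0, np.nan, np.nan, 4.0],
})

# Keep a row if its tag is NOT "train_loop" OR its start is present.
keep_mask = df["tag"].ne("train_loop") | df["start"].notna()
kept = df.loc[keep_mask]

# The negated form from the answer, for comparison.
negated = df.loc[~(df["tag"].eq("train_loop") & df["start"].isna())]

print(kept.equals(negated))
```

Both forms select the same rows, so which one to use is a matter of readability.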