How to drop columns and rows with missing values?

Time:04-30

I've been trying to take a pandas.DataFrame and drop its rows and columns with missing values simultaneously. I tried calling dropna on both axes at once, but found that this is no longer supported. I then tried using dropna to drop the columns and then the rows, and vice versa, but the results differ because the second call no longer sees the initial state of the data. To give an example, I receive:

pandas.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [numpy.nan, 'Batmobile', 'Bullwhip'],
                   "weapon": [numpy.nan, 'Boomerang', 'Gun']})

and return:

pandas.DataFrame({"name": ['Batman', 'Catwoman']})

Any help will be appreciated.

CodePudding user response:

Test whether all values are non-missing per column and per row, using DataFrame.notna with DataFrame.all, then select with DataFrame.loc:

# True where a value is present, computed once from the original frame
m = df.notna()
df0 = df.loc[m.all(axis=1), m.all(axis=0)]
print(df0)
       name
1    Batman
2  Catwoman
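
For comparison, here is a self-contained sketch of the same idea next to the two sequential dropna orders from the question. The variable names (present, both, rows_then_cols, cols_then_rows) are only illustrative, and the data is the example frame from the question. The point it tries to show is that the mask is computed once from the original frame, so rows and columns are judged against the same state, whereas chained dropna calls judge the second axis against an already-reduced frame.

```python
import numpy as np
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                   "toy": [np.nan, "Batmobile", "Bullwhip"],
                   "weapon": [np.nan, "Boomerang", "Gun"]})

# Boolean mask computed once from the original frame: True where a value is present.
present = df.notna()

# Keep only rows with no missing values and only columns with no missing values,
# both judged against the original data.
both = df.loc[present.all(axis=1), present.all(axis=0)]

# Sequential dropna calls: the second call sees the already-reduced frame,
# so the order of the calls changes the result.
rows_then_cols = df.dropna(axis=0).dropna(axis=1)
cols_then_rows = df.dropna(axis=1).dropna(axis=0)

print(both)
print(rows_then_cols.shape, cols_then_rows.shape)
```

With this data, dropping rows first removes the only row that has missing values, so all three columns survive; dropping columns first removes toy and weapon, so all three rows survive. Only the mask-based selection leaves the single name column with the Batman and Catwoman rows.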