I have a DataFrame of unknown size and shape, for example:
   first1  first2  first3  first4
a     NaN      22    56.0      65
c   380.0      40     NaN      66
b   390.0      50    80.0      64
My objective is to delete all columns and rows at which there is a NaN value. In this specific case, the output should be:
   first2  first4
b      50      64
Also, I need to preserve the option to use "all" as in pandas.DataFrame.dropna: when the argument "all" is passed, a column or row must be dropped only if all of its values are missing.
When I tried the following code:
def dropna_mta_style(df, how='any'):
    new_df = df.dropna(axis=0, how=how).dropna(axis=1, how=how)
    return new_df
It obviously didn't work: once the NaN-containing rows are dropped, their NaN values are gone from the remaining frame, so the second dropna call no longer finds the columns that should have been removed.
Thanks in advance!
P.S.: for and while loops, Python built-in functions that act on iterables (all, any, map, ...), and list and dictionary comprehensions must not be used.
CodePudding user response:
Would something like this work?
df.dropna(axis=1, how='any').loc[df.dropna(axis=0, how='any').index]
(Meaning: we take the index of every row that contains no NaNs, df.dropna(axis=0, how='any').index, then use it to select those rows from the frame obtained by dropping every column that has at least one NaN.)
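A quick check of that expression against the frame from the question (the values are reconstructed from the printed table, so treat the setup as an assumption):

```python
import numpy as np
import pandas as pd

# Frame reconstructed from the question's printed table (assumed values)
df = pd.DataFrame(
    {'first1': [np.nan, 380.0, 390.0],
     'first2': [22, 40, 50],
     'first3': [56.0, np.nan, 80.0],
     'first4': [65, 66, 64]},
    index=['a', 'c', 'b'])

# Drop NaN-containing columns, then keep only the rows that had
# no NaNs in the original frame
out = df.dropna(axis=1, how='any').loc[df.dropna(axis=0, how='any').index]
print(out)
```

This leaves only row b with columns first2 and first4, matching the desired output. The key point is that both dropna calls run against the original df, so neither pass hides NaNs from the other.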
CodePudding user response:
This should remove all rows and columns dynamically
import numpy as np

# Flag rows that contain at least one NaN
df['Check'] = df.isin([np.nan]).any(axis=1)
# Drop NaN-containing columns ('Check' has no NaNs, so it survives)
df = df.dropna(axis=1)
# Keep only the rows that were flagged as NaN-free
df = df.loc[df['Check'] == False]
# Remove the helper column
df.drop('Check', axis=1, inplace=True)
df
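As a side note, df.isna().any(axis=1) is the more idiomatic spelling of that NaN check. A minimal run of the same idea, assuming the frame from the question:

```python
import numpy as np
import pandas as pd

# Frame reconstructed from the question's printed table (assumed values)
df = pd.DataFrame(
    {'first1': [np.nan, 380.0, 390.0],
     'first2': [22, 40, 50],
     'first3': [56.0, np.nan, 80.0],
     'first4': [65, 66, 64]},
    index=['a', 'c', 'b'])

df['Check'] = df.isna().any(axis=1)  # True for rows containing a NaN
df = df.dropna(axis=1)               # drop NaN-containing columns
df = df.loc[~df['Check']]            # keep only NaN-free rows
df = df.drop('Check', axis=1)        # remove the helper column
```

The row mask must be computed before dropping columns, because afterwards the surviving columns no longer contain any NaNs to flag.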
CodePudding user response:
Solution intended for readability:
rows = df.dropna(axis=0).index    # labels of rows without NaNs
cols = df.dropna(axis=1).columns  # labels of columns without NaNs
df = df.loc[rows, cols]
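Wrapped into a function with the how parameter the question asks for (the name dropna_mta_style is taken from the question; this is a sketch, not a drop-in from any library):

```python
import numpy as np
import pandas as pd

def dropna_mta_style(df, how='any'):
    # Compute surviving row and column labels independently, both
    # against the *original* frame, then select them in one shot.
    rows = df.dropna(axis=0, how=how).index
    cols = df.dropna(axis=1, how=how).columns
    return df.loc[rows, cols]

# Frame reconstructed from the question's printed table (assumed values)
df = pd.DataFrame(
    {'first1': [np.nan, 380.0, 390.0],
     'first2': [22, 40, 50],
     'first3': [56.0, np.nan, 80.0],
     'first4': [65, 66, 64]},
    index=['a', 'c', 'b'])

result = dropna_mta_style(df)                # drops a, c, first1, first3
untouched = dropna_mta_style(df, how='all')  # nothing here is all-NaN
```

With how='all', no row or column of this sample is entirely NaN, so the frame comes back unchanged, which matches pandas.DataFrame.dropna semantics.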