conditional filtering on rows rather than columns


Given a table

  col_0 col_1 col_2
0  a_00  a_01  a_02
1  a_10   NaN  a_12
2  a_20  a_21  a_22

If I want to return all rows such that col_1 does not contain NaN, that is easily done with df[df['col_1'].notnull()], which returns

  col_0 col_1 col_2
0  a_00  a_01  a_02
2  a_20  a_21  a_22

What if I instead want to return all columns whose value in the row at index 1 is not NaN? This is the result I want:

  col_0 col_2
0  a_00  a_02
1  a_10  a_12
2  a_20  a_22

I could transpose the DataFrame, remove the rows on the transposed DataFrame, and transpose back, but that becomes inefficient if the DataFrame is huge.
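For reference, that transpose round-trip would look roughly like this sketch; each .T call copies the whole frame, which is where the cost comes from:

df.T[df.T[1].notna()].T

I also tried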

df.loc[df.loc[0].notnull()]

but the code gives me an error. Any ideas?

CodePudding user response:

You can use the pandas DataFrame.dropna() function for this.

case 1: drop every column that contains NaN values:

     ex: df.dropna(axis=1)

axis=0 tells dropna to drop rows that contain NaN, while axis=1 tells it to drop columns that contain NaN, which is what you want here.
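A minimal runnable sketch (rebuilding the example frame is my assumption, since the question never shows its construction):

     import numpy as np
     import pandas as pd

     # Rebuild the example frame from the question; np.nan is the missing value in col_1.
     df = pd.DataFrame({"col_0": ["a_00", "a_10", "a_20"],
                        "col_1": ["a_01", np.nan, "a_21"],
                        "col_2": ["a_02", "a_12", "a_22"]})

     df.dropna(axis=0)  # drops row 1, same effect as df[df['col_1'].notnull()]
     df.dropna(axis=1)  # drops col_1 -> the output the question asks for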

case 2: drop columns based on NaN values in only the first n rows:

     ex: df[:n].dropna(axis=1)
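Note that df[:n] also truncates the result to those first n rows. If you want to keep every row and only restrict the NaN check to the first n rows, one possible variant (my sketch, not part of the original answer):

     ex: df.loc[:, df[:n].notna().all()]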

case 3: drop columns within a subset of columns:

     ex: df[["col_1","col_2"]].dropna(axis = 1)  

This drops whichever of these two columns contains NaN values.
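With the example frame from the question, that call keeps only col_2, because col_1 holds a NaN:

       col_2
     0  a_02
     1  a_12
     2  a_22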

Note: if you want to make this change permanent, either use inplace=True (df.dropna(axis=1, inplace=True)) or assign the result to another variable (df2 = df.dropna(axis=1)).

CodePudding user response:

Use boolean indexing with loc along the columns axis:

df.loc[:, df.iloc[1].notna()]

Result

  col_0 col_2
0  a_00  a_02
1  a_10  a_12
2  a_20  a_22
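Since the frame has the default integer index, a label-based mask is equivalent; the asker's own attempt failed only because the mask was passed as the row indexer instead of the column indexer:

df.loc[:, df.loc[1].notna()]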