Delete the rows which have some NaN values

Time:07-31

I want to delete the rows that have NaN values in all columns from column 2 (val1) through the last one. Here is a simple example:

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['index'] = [1, 2, 3, 4, 5]
df['buy'] = [10, 2, 6, 7, 8]
df['val1'] = [1, 2, np.nan, 8, np.nan]
df['val2'] = [10, np.nan, np.nan, 28, np.nan]
df['val3'] = [11, 32, np.nan, 18, np.nan]

Which gives

   index  buy  val1  val2  val3
0      1   10   1.0  10.0  11.0
1      2    2   2.0   NaN  32.0
2      3    6   NaN   NaN   NaN
3      4    7   8.0  28.0  18.0
4      5    8   NaN   NaN   NaN

And here is the desired output:

   index  buy  val1  val2  val3
0      1   10   1.0  10.0  11.0
1      2    2   2.0   NaN  32.0
2      4    7   8.0  28.0  18.0

Can you please help me with this?

CodePudding user response:

You can use

df.dropna()

to drop every row that contains at least one NaN. On the data above that leaves

   index  buy  val1  val2  val3
0      1   10   1.0  10.0  11.0
3      4    7   8.0  28.0  18.0

Edit

Given the updated question where all of val1, val2, and val3 must be NaN for the row to be dropped, you can use

df.dropna(subset=['val1', 'val2', 'val3'], how='all')

This drops a row only if all three of these values are NaN in that row, which gives

   index  buy  val1  val2  val3
0      1   10   1.0  10.0  11.0
1      2    2   2.0   NaN  32.0
3      4    7   8.0  28.0  18.0
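
For reference, here is a minimal self-contained sketch of the call above, run against the question's sample data. The reset_index step is an optional addition, not part of the answer; it only renumbers the row labels to 0, 1, 2 as in the desired output. The variable name result is purely illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "index": [1, 2, 3, 4, 5],
    "buy": [10, 2, 6, 7, 8],
    "val1": [1, 2, np.nan, 8, np.nan],
    "val2": [10, np.nan, np.nan, 28, np.nan],
    "val3": [11, 32, np.nan, 18, np.nan],
})

# Drop a row only when val1, val2 and val3 are all NaN in that row.
result = df.dropna(subset=["val1", "val2", "val3"], how="all")

# Optional (assumption): renumber the row labels 0..n-1 after dropping.
result = result.reset_index(drop=True)

print(result)
```

Note that dropna returns a new DataFrame and leaves df unchanged, so the result has to be assigned if you want to keep it.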

CodePudding user response:

Use iloc with isnull and all to build the filter:

df[~df.iloc[:, 2:].isnull().all(axis=1)]

   index  buy  val1  val2  val3
0      1   10   1.0  10.0  11.0
1      2    2   2.0   NaN  32.0
3      4    7   8.0  28.0  18.0

Or use notnull with any:

df[df.iloc[:, 2:].notnull().any(axis=1)]

   index  buy  val1  val2  val3
0      1   10   1.0  10.0  11.0
1      2    2   2.0   NaN  32.0
3      4    7   8.0  28.0  18.0
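
As a rough sketch, the positional filter can also be combined with dropna so that the column names do not need to be typed out. The names value_cols, by_mask and by_dropna are made up for this illustration, and the sample data is copied from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "index": [1, 2, 3, 4, 5],
    "buy": [10, 2, 6, 7, 8],
    "val1": [1, 2, np.nan, 8, np.nan],
    "val2": [10, np.nan, np.nan, 28, np.nan],
    "val3": [11, 32, np.nan, 18, np.nan],
})

# Every column from position 2 (the third column) to the last one.
value_cols = list(df.columns[2:])

# Boolean-mask approach: keep rows with at least one non-NaN value.
by_mask = df[df[value_cols].notna().any(axis=1)]

# dropna approach: drop rows where all of those columns are NaN.
by_dropna = df.dropna(subset=value_cols, how="all")

# Both approaches should select the same rows.
print(by_mask.equals(by_dropna))
```

Passing axis=1 by keyword rather than position keeps the call working on recent pandas versions, where any no longer accepts the axis positionally.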