I want to delete the rows that have NaN in every column from column 2 (val1) through the last one. Here is a simple example:
import numpy as np
import pandas as pd
df = pd.DataFrame()
df['index'] = [1, 2, 3, 4, 5]
df['buy'] = [10, 2, 6, 7, 8]
df['val1'] = [1, 2, np.nan, 8, np.nan]
df['val2'] = [10, np.nan, np.nan, 28, np.nan]
df['val3'] = [11, 32, np.nan, 18, np.nan]
Which gives
index buy val1 val2 val3
0 1 10 1.0 10.0 11.0
1 2 2 2.0 NaN 32.0
2 3 6 NaN NaN NaN
3 4 7 8.0 28.0 18.0
4 5 8 NaN NaN NaN
And here is the desired output:
index buy val1 val2 val3
0 1 10 1 10 11
1 2 2 2 np.nan 32
2 4 7 8 28 18
Can you please help me with this?
CodePudding user response:
You can use
df.dropna()
which drops every row that contains at least one NaN. With the data above, that also removes row 1 (its val2 is NaN), leaving
index buy val1 val2 val3
0 1 10 1.0 10.0 11.0
3 4 7 8.0 28.0 18.0
Edit
Given the updated question, where all of val1, val2, and val3 must be NaN for the row to be dropped, you can use
df.dropna(subset=['val1', 'val2', 'val3'], how='all')
This says to drop a row only if all three of these values are NaN in that row. This gives
index buy val1 val2 val3
0 1 10 1.0 10.0 11.0
1 2 2 2.0 NaN 32.0
3 4 7 8.0 28.0 18.0
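If you would rather not type the column names, here is a minimal sketch that picks them by position instead. It assumes "from column 2 until last one" means everything from val1 onward; the names value_cols and result are just illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    "index": [1, 2, 3, 4, 5],
    "buy": [10, 2, 6, 7, 8],
    "val1": [1, 2, np.nan, 8, np.nan],
    "val2": [10, np.nan, np.nan, 28, np.nan],
    "val3": [11, 32, np.nan, 18, np.nan],
})

# Assumption: "column 2 until last one" = positions 2 onward (val1, val2, val3)
value_cols = list(df.columns[2:])

# Drop a row only when every one of those columns is NaN
result = df.dropna(subset=value_cols, how="all")
print(result)
```

Row 1 is kept because only val2 is missing there; rows 2 and 4 are dropped because all three value columns are NaN.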
CodePudding user response:
Use iloc, isnull, and all to create the filter:
df[~df.iloc[:, 2:].isnull().all(axis=1)]
index buy val1 val2 val3
0 1 10 1.0 10.0 11.0
1 2 2 2.0 NaN 32.0
3 4 7 8.0 28.0 18.0
Or notnull and any:
df[df.iloc[:, 2:].notnull().any(axis=1)]
index buy val1 val2 val3
0 1 10 1.0 10.0 11.0
1 2 2 2.0 NaN 32.0
3 4 7 8.0 28.0 18.0
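The filtered frame keeps the original row labels (0, 1, 3), while the desired output in the question is numbered 0, 1, 2. A rough sketch of one way to get that, assuming renumbering is actually wanted: build the mask as a named variable and call reset_index(drop=True) afterwards. The names mask and filtered are only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    "index": [1, 2, 3, 4, 5],
    "buy": [10, 2, 6, 7, 8],
    "val1": [1, 2, np.nan, 8, np.nan],
    "val2": [10, np.nan, np.nan, 28, np.nan],
    "val3": [11, 32, np.nan, 18, np.nan],
})

# True for rows with at least one non-NaN value from column position 2 onward
mask = df.iloc[:, 2:].notna().any(axis=1)

# Keep those rows, then renumber the row labels from 0
filtered = df[mask].reset_index(drop=True)
print(filtered)
```

Passing axis=1 by keyword makes the row-wise direction explicit, and drop=True stops the old labels from being added back as a new column.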