I am facing a problem when I need to exclude some rows from by dataframe.
here the code:
import numpy as np
import pandas as pd
dff = {
'ID': [1, 2, 3, 4, 5, 6, 7],
'Default': [1,1,0,1,0,1,0],
'Default_Amount': [1200,2000,0,350,0,760,0],
'Tot_Amount': [1200,2000,3400,350,10000,760,7500],
'Time' : ['October','March','November','November','January','December','January'],
'Class': ['A','B','A','A','B','B','A']
}
dff = pd.DataFrame(dff)
display(dff)
dff[(dff.Time != 'November') & (dff.Default == 1) ]
What I am trying to do, is to exclude the row in dataframe that has the following two conditions: time = November with default = 1 (that is the ID 4 in the dataframe).
But if I execute this code "dff[(dff.Time != 'November') & (dff.Default == 1) ]
", it excludes also the other time = "November" and default = 0.
How can I avoid this problem?
Thanks in advance!
CodePudding user response:
You need match not equal dff.Default != 1
with bitwise OR
by |
:
df = dff[(dff.Time != 'November') | (dff.Default != 1) ]
Or invert mask, but change |
to &
for bitwise AND
and change !=
to ==
:
df = dff[~((dff.Time == 'November') & (dff.Default == 1)) ]
CodePudding user response:
This is simple boolean algebra
Either follow De Morgan's law and invert both conditions while getting an OR (|
):
dff[(dff.Time != 'November') | (dff.Default != 1) ]
or add (&
) your two conditions, and negate (~
):
dff[~((dff.Time == 'November') & (dff.Default == 1)) ]
CodePudding user response:
Why don't you simply use
dff[dff.ID != 4]
or otherwise you can use
dff[~((dff.Time == 'November') & (dff.Default == 1))]
The problem with you code is that
(dff.Time != 'November')
first selects all non-November rows and then (dff.Default == 1)
selects all row with Default '1' only.