I want to subset my df based on the condition of 2 columns: Date and Person
Condition:
Date must be NON-BLANK except when Person =='Peter'
Tried the code but doesn't work, it removes ALL rows with blank in Date
df= df[ (df[df.columns[1]]!='peter') & (pd.isnull (df[df.columns[0]])!=True)]
INPUT
Date | Person |
| tom |
| peter |
| jack |
2010-10-4 | harry |
2010-12-4 | sarah |
DESIRED OUTPUT
Date | Person |
| peter |
2010-10-4 | harry |
2010-12-4 | sarah |
CodePudding user response:
Chain condition by |
for bitwise OR
by columns names:
df = df[df['Person'].eq('peter') | df['Date'].notna()]
Or by positions - selected columns by DataFrame.iloc
:
df = df[df.iloc[:, 1].eq('peter') | df.iloc[:, 0].notna()]
print (df)
Date Person
1 NaN peter
3 2010-10-4 harry
4 2010-12-4 sarah
CodePudding user response:
You need to use |
(boolean OR):
# keep if peter OR if the data is NOT null
df[df[df.columns[1]].eq('peter') | ~df[df.columns[0]].isnull()]