So let's say I have the following DataFrame:
import pandas as pd

data = pd.DataFrame({'Name': ['RACHEL', 'MONICA', 'PHOEBE', 'ROSS', 'CHANDLER', 'JOEY', 'RACHEL', 'RACHEL'],
                     'Age': [30, 35, 37, 33, 34, 30, 30, 15],
                     'Salary': [100000, 93000, 88000, 120000, 94000, 95000, 100000, 10],
                     'Job': ['DESIGNER', 'CHEF', 'MASUS', 'PALENTOLOGY',
                             'IT', 'ARTIST', 'DESIGNER', 'CHEF']})
which gives:
Name Age Salary Job
RACHEL 30 100000 DESIGNER
MONICA 35 93000 CHEF
PHOEBE 37 88000 MASUS
ROSS 33 120000 PALENTOLOGY
CHANDLER 34 94000 IT
JOEY 30 95000 ARTIST
RACHEL 30 100000 DESIGNER
RACHEL 15 10 CHEF
What I want to do is pretty simple: I want to drop the rows where both Name == 'RACHEL' and Job == 'CHEF' hold in the same row, and keep every other row.
Expected result set:
Name Age Salary Job
RACHEL 30 100000 DESIGNER
MONICA 35 93000 CHEF
PHOEBE 37 88000 MASUS
ROSS 33 120000 PALENTOLOGY
CHANDLER 34 94000 IT
JOEY 30 95000 ARTIST
RACHEL 30 100000 DESIGNER
Note that the last entry is removed.
What I have tried so far is:
data = data.loc[ (data.Name != 'RACHEL') & (data.Job != 'CHEF') ]
This also removes every other row where Name == 'RACHEL' or Job == 'CHEF'. I only want to remove the last row, where Name == 'RACHEL' and, in the same row, Job == 'CHEF'.
Any help is appreciated. Thanks.
CodePudding user response:
Use this:
data = data.loc[ ~((data.Name == 'RACHEL') & (data.Job == 'CHEF')) ]
You want to remove all the rows that have both Name = RACHEL and Job = CHEF. So just write that condition and invert it with ~ to filter them out.
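To make the idea concrete, here is a minimal sketch that rebuilds the DataFrame from the question and applies the inverted mask. The names `to_drop` and `filtered` are only illustrative, not part of the original answer.

```python
import pandas as pd

data = pd.DataFrame({
    "Name": ["RACHEL", "MONICA", "PHOEBE", "ROSS", "CHANDLER", "JOEY", "RACHEL", "RACHEL"],
    "Age": [30, 35, 37, 33, 34, 30, 30, 15],
    "Salary": [100000, 93000, 88000, 120000, 94000, 95000, 100000, 10],
    "Job": ["DESIGNER", "CHEF", "MASUS", "PALENTOLOGY", "IT", "ARTIST", "DESIGNER", "CHEF"],
})

# True only for rows that match BOTH conditions (the rows to drop).
to_drop = (data["Name"] == "RACHEL") & (data["Job"] == "CHEF")

# Invert the mask with ~ to keep everything else.
filtered = data.loc[~to_drop]

print(to_drop.sum())  # number of rows flagged for removal
print(filtered)
```

Building the "rows to drop" mask first and inverting it afterwards keeps the condition readable, and the mask can be inspected on its own before any rows are discarded.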
CodePudding user response:
rachefs = data[~(data["Name"] == "RACHEL") | ~(data["Job"] == "CHEF")]
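This works because of De Morgan's law: ~A | ~B is equivalent to ~(A & B). OR-ing the two negated conditions keeps every row that fails at least one of them, so only the rows where both Name is RACHEL and Job is CHEF are removed.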
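If the equivalence is not obvious, it can be checked directly on boolean masks. The sketch below uses a small illustrative subset of the question's data (an assumption made for brevity) and compares the three forms, including the `&` of negations from the question's attempt.

```python
import pandas as pd

# Illustrative four-row subset of the question's data.
data = pd.DataFrame({
    "Name": ["RACHEL", "MONICA", "JOEY", "RACHEL"],
    "Job": ["DESIGNER", "CHEF", "ARTIST", "CHEF"],
})

is_rachel = data["Name"] == "RACHEL"
is_chef = data["Job"] == "CHEF"

negated_and = ~(is_rachel & is_chef)       # invert the combined condition
or_of_negations = ~is_rachel | ~is_chef    # De Morgan equivalent
and_of_negations = ~is_rachel & ~is_chef   # the question's attempt

same = negated_and.equals(or_of_negations)
print(same)
print(negated_and.tolist())
print(and_of_negations.tolist())
```

The first two masks agree row for row, while the `&` of negations also rejects rows that match only one of the conditions, which is why the original attempt dropped too much.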