I have the following dataframe:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Name': ['A','B','A','B','A','B','A','B'],
'Include':[np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],
'Category':['Cat','Dog','Car','Dog','Bike','Dog','Cat','Bike'],
})
df
I am trying to fill the Include column with the string 'yes' wherever the value in the Category column is not in the following list:
exluded = ['Car','Bike']
So that my expected output is this:
Any ideas on how to achieve this? Thanks!
CodePudding user response:
Try this:
df = pd.DataFrame({
'Name': ['A','B','A','B','A','B','A','B'],
'Include':[np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],
'Category':['Cat','Dog','Car','Dog','Bike','Dog','Cat','Bike'],
})
exluded = ['Car','Bike']
# keep the existing value where Category is excluded, fill 'yes' everywhere else
df.Include = df.Include.where(df.Category.isin(exluded), 'yes')
df
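For anyone unsure which way round where works: it keeps the existing value wherever the condition is True and substitutes the second argument everywhere else. Below is a minimal sketch on a shortened four-row frame. The frame and the name keep are made up for illustration, and Include is created as an object column here (my assumption, not part of the answer above) so that it can hold strings without a dtype change.

```python
import numpy as np
import pandas as pd

# Shortened, illustrative frame (not the full data from the question).
df = pd.DataFrame({'Category': ['Cat', 'Dog', 'Car', 'Bike']})
exluded = ['Car', 'Bike']

# Assumption: start from an object-dtype column of NaN so strings fit as-is.
df['Include'] = pd.Series(np.nan, index=df.index, dtype=object)

# True where the existing NaN should be kept (excluded categories).
keep = df['Category'].isin(exluded)

# where() keeps values where `keep` is True and writes 'yes' elsewhere.
df['Include'] = df['Include'].where(keep, 'yes')
print(df)
```

Rows whose Category is in exluded stay missing; the remaining rows get 'yes'.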
CodePudding user response:
Use loc and a boolean mask:
df.loc[~df['Category'].isin(exluded), 'Include'] = 'yes'
print(df)
# Output
  Name Include Category
0    A     yes      Cat
1    B     yes      Dog
2    A     NaN      Car
3    B     yes      Dog
4    A     NaN     Bike
5    B     yes      Dog
6    A     yes      Cat
7    B     NaN     Bike
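Depending on the pandas version, writing a string into an all-NaN float64 column through loc may emit a dtype warning or be rejected. Here is a sketch that sidesteps this by casting Include to object first. It assumes a shortened frame, and the variable name mask is only illustrative.

```python
import numpy as np
import pandas as pd

# Shortened, illustrative frame with the same columns as the question.
df = pd.DataFrame({
    'Name': ['A', 'B', 'A', 'B'],
    'Include': [np.nan] * 4,
    'Category': ['Cat', 'Dog', 'Car', 'Bike'],
})
exluded = ['Car', 'Bike']

# Cast to object so the column can hold strings alongside NaN.
df['Include'] = df['Include'].astype(object)

# True for the rows that should be marked, i.e. Category not in the list.
mask = ~df['Category'].isin(exluded)

# Assign only on the selected rows; the others keep their NaN.
df.loc[mask, 'Include'] = 'yes'
print(df)
```

Keeping the mask in its own variable also makes it easy to check how many rows will be touched (mask.sum()) before assigning.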
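Alternative with np.where (None is used for the missing values here, because mixing np.nan with the string 'yes' makes NumPy pick a string dtype and turns the NaN into the text 'nan'):
df['Include'] = np.where(df['Category'].isin(exluded), None, 'yes')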