Fill NaN in pandas where another column's value is not in a list

Time:05-29

I have the following dataframe:

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'Name': ['A','B','A','B','A','B','A','B'],
    'Include': [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],
    'Category': ['Cat','Dog','Car','Dog','Bike','Dog','Cat','Bike'],
})

df


I am trying to fill the Include column with the string yes wherever the value in the Category column is not in the following list:

exluded = ['Car','Bike']

So that my expected output has yes in Include for every Cat and Dog row, and NaN for the Car and Bike rows.

Any ideas on how to achieve this? Thanks!

CodePudding user response:

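Try Series.where, which keeps the existing value where the condition is True and replaces it everywhere else: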

df = pd.DataFrame({
    'Name': ['A','B','A','B','A','B','A','B'],
    'Include': [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],
    'Category': ['Cat','Dog','Car','Dog','Bike','Dog','Cat','Bike'],
})

exluded = ['Car','Bike']

# where() keeps Include where the condition is True (Category is excluded)
# and writes 'yes' into every other row
df.Include = df.Include.where(df.Category.isin(exluded), 'yes')
df

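If the inverted logic of where() is hard to read, Series.mask is its mirror image: it replaces values where the condition is True. A minimal sketch of the same fill written that way, reusing the question's data and its exluded list (the [np.nan] * 8 shorthand is mine):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Include': [np.nan] * 8,
    'Category': ['Cat', 'Dog', 'Car', 'Dog', 'Bike', 'Dog', 'Cat', 'Bike'],
})
exluded = ['Car', 'Bike']

# mask() replaces values where the condition is True,
# so the condition here is "Category is NOT in the excluded list".
df['Include'] = df['Include'].mask(~df['Category'].isin(exluded), 'yes')
print(df)
```

This should leave the Car and Bike rows as NaN and set the rest to yes, the same as the where() call above.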

CodePudding user response:

Use loc and a boolean mask:

df.loc[~df['Category'].isin(exluded), 'Include'] = 'yes'
print(df)

# Output
  Name Include Category
0    A     yes      Cat
1    B     yes      Dog
2    A     NaN      Car
3    B     yes      Dog
4    A     NaN     Bike
5    B     yes      Dog
6    A     yes      Cat
7    B     NaN     Bike
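One caveat with the loc approach: Include starts out as a float64 column (it is all NaN), and newer pandas releases (2.1 and later, as far as I know) emit a FutureWarning when a string is written into it. A hedged sketch that sidesteps this by casting the column to object first; the keep variable name is just for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Include': [np.nan] * 8,
    'Category': ['Cat', 'Dog', 'Car', 'Dog', 'Bike', 'Dog', 'Cat', 'Bike'],
})
exluded = ['Car', 'Bike']

# Assumption: make the all-NaN float column an object column
# so it can hold strings without an implicit dtype change.
df['Include'] = df['Include'].astype(object)

# Boolean mask of the rows whose Category is not excluded.
keep = ~df['Category'].isin(exluded)
df.loc[keep, 'Include'] = 'yes'
print(df)
```

The result is the same as the output shown above; only the column dtype is set explicitly beforehand.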

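Alternative with np.where (use None rather than np.nan for the missing value, because mixing np.nan with a string makes NumPy return the text 'nan' instead of a real missing value):

df['Include'] = np.where(df['Category'].isin(exluded), None, 'yes')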
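With np.where it is worth checking that the untouched rows are genuinely missing and not the text 'nan', which is what NumPy tends to produce when np.nan and a string share one array. A quick sanity check, assuming None as the placeholder (the n_missing and has_text_nan names are made up for this sketch):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Include': [np.nan] * 8,
    'Category': ['Cat', 'Dog', 'Car', 'Dog', 'Bike', 'Dog', 'Cat', 'Bike'],
})
exluded = ['Car', 'Bike']

# None (not np.nan) keeps NumPy from coercing the result to a string dtype.
df['Include'] = np.where(df['Category'].isin(exluded), None, 'yes')

# Real missing values are picked up by isna(); the text 'nan' would not be.
n_missing = int(df['Include'].isna().sum())
has_text_nan = bool((df['Include'] == 'nan').any())
print(n_missing, has_text_nan)
```

If the check reports three missing rows and no 'nan' strings, later calls such as fillna or dropna should behave as expected on this column.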