If statement with multiple conditions and empty values

This should be a simple one.

I want to create a new column that is populated with Y and N, based on certain conditions:

If a value from id in df1 exists in id in df2, return Y, else N. If a value in id in df1 is NaN, return N. (id column is numeric)

What I have is:

df1['new_col']=df1['id'].isin(df2['id']).replace({False: 'N', True:'Y'})

which is fine, except that I also get a Y for empty values, but I want an N for NaN values.

So I've tried these, but neither works:

df1['new_col']=df1['new_col'].apply(lambda x: 'N' if pd.isnull(x)==True else x['new_col'])

def ss_am(x):
    if x['id'].isin(df2['id']):
        return 'Y'
    if x['id']==pd.isnull(x):
        return 'N'
    else:
        return 'N'
    
df['new_col']=df.apply(ss_am, 1)

Example:

CodePudding user response：

Assuming the 'id' column contains strings (since I don't see NaN in the output) and the values are always > 0, this can work:

df1['new_col'] = df1['id'].replace('', '-1').isin(df2['id']).replace({False: 'N', True:'Y'})

This temporarily replaces the blanks with '-1', a value that df2 is very unlikely to contain, so those rows never match.

If the blanks are NaNs instead (and the ids are integers/floats), you can use fillna() with a suitable placeholder value:

df1['new_col'] = df1['id'].fillna(-1).isin(df2['id']).replace({False: 'N', True:'Y'})
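If you would rather not pick a placeholder value at all, a variant of the same idea is to combine the isin() mask with notna(), so NaN ids are excluded explicitly. This is only a minimal sketch: the sample frames below are made up for illustration, and only the column names id and new_col come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the question). df2 deliberately
# contains a NaN id, which can make a plain isin() flag NaN rows in df1.
df1 = pd.DataFrame({"id": [1.0, 2.0, np.nan, 4.0]})
df2 = pd.DataFrame({"id": [1.0, np.nan, 4.0, 5.0]})

# 'Y' only where the id is not NaN AND is present in df2
mask = df1["id"].notna() & df1["id"].isin(df2["id"])
df1["new_col"] = np.where(mask, "Y", "N")

print(df1)
```

This avoids having to guarantee that the placeholder (-1 above) never appears as a real id in df2.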

CodePudding user response：

This can easily be done by merging the two DataFrames:

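# Build a lookup of the ids present in df2, flagged 'Y'
# (drop NaN and duplicate ids so the merge neither matches NaN nor duplicates rows)
lookup = df2[['id']].dropna().drop_duplicates().assign(new_col='Y')
# Left merge keeps every row of df1 and adds no extra rows from df2
output_df = pd.merge(df1, lookup, how='left', on='id')
# Rows without a match (including NaN ids) get 'N'
output_df['new_col'] = output_df['new_col'].fillna('N')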
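Another way to express the merge approach is to let pandas report which rows matched, using the indicator=True option of merge. The following is a sketch under assumptions: the sample data and the helper name keys are invented for illustration, and NaN and duplicate ids are dropped from df2 first so that NaN never counts as a match and df1 rows are not duplicated.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the question)
df1 = pd.DataFrame({"id": [1.0, 2.0, np.nan, 4.0]})
df2 = pd.DataFrame({"id": [1.0, 4.0, 4.0, np.nan]})

# Unique, non-null ids from df2 ("keys" is an illustrative name)
keys = df2[["id"]].dropna().drop_duplicates()

# Left merge keeps every df1 row; the _merge column is 'both' for
# rows whose id was found in keys and 'left_only' otherwise
merged = df1.merge(keys, on="id", how="left", indicator=True)
merged["new_col"] = np.where(merged["_merge"] == "both", "Y", "N")
merged = merged.drop(columns="_merge")

print(merged)
```

Compared with adding a flag column to df2, this leaves df2 itself unmodified.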