How does Python (not pandas) treat NaN within a function?

I have some NaN values in a column of a DataFrame. If I use isnull(), I am able to extract these rows (see below).

However, if I try to use a function to assign a new value to these null values, it doesn't work (see below). What would be the right way to identify a null value inside a function?

I know I could use fillna to replace all nulls with an empty string and then change the if statement to if str(x) == '', but I am trying to understand how Python (not pandas) treats NaN. Thanks.

Below is the code:

import pandas as pd

df[df['comment'].isnull()]
def new_column(x):
    if  str(x) is None:
        return 'no value'
    else:
        return x
df['test'] = df.apply(lambda row: new_column(row['comment']),axis=1)

CodePudding user response:

Inside a function, you can use pd.isna:

df['test'] = df['comment'].apply(lambda x: 'no value' if pd.isna(x) else x)

# Or

def new_column(x):
    if pd.isna(x):
        return 'no value'
    else:
        return x

df['test'] = df['comment'].apply(new_column)
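
On the plain Python side of the question: a missing value in a pandas column is usually a float NaN, not None, and str(x) always returns a string, so the test str(x) is None can never be true. The sketch below shows this without pandas. It assumes the missing entries are float NaN or None; the name new_column_plain is hypothetical and only used for illustration.

```python
import math

nan = float("nan")  # same kind of float value that np.nan is

# NaN is a float, not None, so identity checks against None never match.
print(type(nan))         # <class 'float'>
print(nan is None)       # False
print(str(nan))          # nan  (a string, which is never None either)
print(str(nan) is None)  # False

# NaN is not equal to anything, including itself.
print(nan == nan)        # False
print(nan != nan)        # True


def new_column_plain(x):
    """Return 'no value' for None or a float NaN, otherwise return x unchanged."""
    # The isinstance guard matters: math.isnan raises TypeError for strings.
    if x is None or (isinstance(x, float) and math.isnan(x)):
        return "no value"
    return x


print(new_column_plain(nan))      # no value
print(new_column_plain("hello"))  # hello
```

pd.isna does this kind of check for you and also recognises pandas' other missing-value markers such as pd.NA and NaT, which is why it is the simpler choice inside apply.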

There are also several "vectorized" alternatives that avoid apply altogether:

Use fillna:

df['test'] = df['comment'].fillna('no value')

Or use np.where:

import numpy as np

df['test'] = np.where(df['comment'].isna(), 'no value', df['comment'])

Or where from pandas:

df['test'] = df['comment'].where(df['comment'].notna(), other='no value')
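
To check that these approaches agree, here is a small self-contained sketch. The sample DataFrame is made up for illustration and stands in for the real data; only the comment column name is taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
df = pd.DataFrame({"comment": ["good", np.nan, "bad"]})

# Element-wise check inside a function, using pd.isna.
via_apply = df["comment"].apply(lambda x: "no value" if pd.isna(x) else x)

# Vectorized alternatives.
via_fillna = df["comment"].fillna("no value")
via_np_where = pd.Series(
    np.where(df["comment"].isna(), "no value", df["comment"]),
    index=df.index,
)
via_where = df["comment"].where(df["comment"].notna(), other="no value")

expected = ["good", "no value", "bad"]
results = [via_apply, via_fillna, via_np_where, via_where]
print(all(r.tolist() == expected for r in results))
```

For a simple replacement like this, fillna is the shortest option. np.where and where become more useful when the condition is something other than "is missing".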