Home > Software engineering >  How to count where column value is falsy in Pandas?
How to count where column value is falsy in Pandas?

Time:05-30

There is a dataframe with a column name ADDRESS:

I try to count how much rows where address is null, false, Nan, None, empty string and etc.

I have tried this:

t = len(new_dfr[new_dfr['ADDRESS'] == ''])
print(r)

How to do that in Pandas?

CodePudding user response:

You can count NA values with isna():

df['ADDRESS'].isna().sum()

This will count all empty, None, NaN, etc. values, but not False. You could replace False with None to cover that:

df['ADDRESS'].replace(False, None).isna().sum()

CodePudding user response:

If I understood you correctly, you basically want to count all the falsy values including the NaNs (note that NaNs are considered truthy). In pandas terms this can be translated into

# (ADDRESS is NaN) OR (ADDRESS is not truthy)
(new_dfr['ADDRESS'].isna() | ~new_dfr['ADDRESS'].astype(bool)).sum()

Example:

new_dfr = pd.DataFrame({
    'ADDRESS': [np.nan, None, False, '', 0, 1, True, 'not empty']
})


>>> new_dfr
     
     ADDRESS
0        NaN
1       None
2      False
3           
4          0
5          1
6       True
7  not empty

>>> new_dfr['ADDRESS'].isna()

0     True
1     True
2    False
3    False
4    False
5    False
6    False
7    False
Name: ADDRESS, dtype: bool

>>> ~new_dfr['ADDRESS'].astype(bool)

0    False
1     True
2     True
3     True
4     True
5    False
6    False
7    False
Name: ADDRESS, dtype: bool

>>> new_dfr['ADDRESS'].isna() | ~new_dfr['ADDRESS'].astype(bool)

0     True
1     True
2     True
3     True
4     True
5    False
6    False
7    False
Name: ADDRESS, dtype: bool

>>> (new_dfr['ADDRESS'].isna() | ~new_dfr['ADDRESS'].astype(bool)).sum()
5
  • Related