There is a dataframe with a column name ADDRESS
:
I try to count how much rows where address is null, false, Nan, None, empty string
and etc.
I have tried this:
t = len(new_dfr[new_dfr['ADDRESS'] == ''])
print(r)
How to do that in Pandas?
CodePudding user response:
You can count NA values with isna():
df['ADDRESS'].isna().sum()
This will count all empty, None, NaN, etc. values, but not False
. You could replace False
with None
to cover that:
df['ADDRESS'].replace(False, None).isna().sum()
CodePudding user response:
If I understood you correctly, you basically want to count all the falsy values including the NaNs (note that NaNs are considered truthy). In pandas terms this can be translated into
# (ADDRESS is NaN) OR (ADDRESS is not truthy)
(new_dfr['ADDRESS'].isna() | ~new_dfr['ADDRESS'].astype(bool)).sum()
Example:
new_dfr = pd.DataFrame({
'ADDRESS': [np.nan, None, False, '', 0, 1, True, 'not empty']
})
>>> new_dfr
ADDRESS
0 NaN
1 None
2 False
3
4 0
5 1
6 True
7 not empty
>>> new_dfr['ADDRESS'].isna()
0 True
1 True
2 False
3 False
4 False
5 False
6 False
7 False
Name: ADDRESS, dtype: bool
>>> ~new_dfr['ADDRESS'].astype(bool)
0 False
1 True
2 True
3 True
4 True
5 False
6 False
7 False
Name: ADDRESS, dtype: bool
>>> new_dfr['ADDRESS'].isna() | ~new_dfr['ADDRESS'].astype(bool)
0 True
1 True
2 True
3 True
4 True
5 False
6 False
7 False
Name: ADDRESS, dtype: bool
>>> (new_dfr['ADDRESS'].isna() | ~new_dfr['ADDRESS'].astype(bool)).sum()
5