I have a dataframe where there are 2 columns I want to filter and count total number of "null" values for each column.
Tried this code;
chck_nulls = df['account_id'].isnull().sum() | df['customer_id'].isnull().sum()
print (df[chck_nulls])
Getting this error;
chck_nulls = df['account_id'].isnull().sum() | df['customer_id'].isnull().sum()
print (df[chck_nulls])
1
chck_nulls = df['account_id'].isnull().sum() | df['customer_id'].isnull().sum()
2
print (df[chck_nulls])
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3079 try:
-> 3080 return self._engine.get_loc(casted_key)
3081 except KeyError as err:
KeyError: 28671
Sample Data
Customer Name account_id customer_id
Adam null null
Michael null null
Jenkins null null
Expected results;
customer_id 3
account_id 3
Any help would be highly appreciated!
Thanks
CodePudding user response:
chck_nulls_account_id, chck_nulls_customer_id = (df['account_id'].isnull()).sum(), (df['customer_id'].isnull()).sum()
# print(chck_nulls_account_id, chck_nulls_customer_id)
print(f'customer_id\t{chck_nulls_customer_id}')
print(f'account_id\t{chck_nulls_account_id}')
# The following two print statements will give you the df with entries having only null values in each column
print(df[df['account_id'].isnull()])
print(df[df['customer_id'].isnull()])
# The following print statement will give you the df with entries having only null values in both the columns
print(df[(df['account_id'].isnull()) | df['customer_id'].isnull()])
print(chck_nulls)
will give you the correct sum of total null entries in account_id and customer_id columns.