How to count the columns that have missing values

I have a large dataset. I partitioned the data into training and test sets.

I found the missing values of the independent variables.

I want to count the number of columns that have missing values. In this case, I should get 12 column names. So far I have only been able to sum the missing values within each column.

Here is my attempt:

finding_missing_values = data.train.isnull().sum()
finding_missing_values

finding_missing_values.sum()

Is there a way to count the number of columns that have a missing value?

CodePudding user response：

You wrote

finding_missing_values.sum()

You were looking for

(finding_missing_values > 0).values.sum()

From .values we get a NumPy array.

The comparison gives us False / True values, which .sum() conveniently treats as 0 / 1, so the result is the number of columns with at least one missing value.
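To make that concrete, here is a small sketch on a made-up frame. The name `train` and its three columns are assumptions standing in for the asker's training data, which is not shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real training set
train = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],     # 1 missing value
    "b": [1.0, 2.0, 3.0],        # no missing values
    "c": [np.nan, np.nan, 3.0],  # 2 missing values
})

# Missing values per column, as in the question
finding_missing_values = train.isnull().sum()

# Total number of missing cells (what the question computed)
total_missing_cells = int(finding_missing_values.sum())

# Number of columns with at least one missing value:
# each True in the comparison counts as 1 when summed
columns_with_missing = int((finding_missing_values > 0).values.sum())

print(total_missing_cells)    # 3
print(columns_with_missing)   # 2
```

The first number counts cells and the second counts columns, which is the difference between the two expressions above.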

CodePudding user response：

Convert the per-column counts to a list, then count the nonzero values as follows.

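# Per-column missing-value counts as a plain Python list
finding_missing_values = data.train.isnull().sum().to_list()
# Count how many columns have at least one missing value
number_of_missing_value_columns = sum(k > 0 for k in finding_missing_values)
print(number_of_missing_value_columns)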

This should give the number of columns with missing values.
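Since the question also mentions getting the column names, here is one possible sketch that returns both the count and the names using `isnull().any()`. This is an alternative to the approaches above, not something either answer proposed, and the frame `train` is again made-up example data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real training set
train = pd.DataFrame({
    "age": [25.0, np.nan, 40.0],
    "income": [50.0, 60.0, 70.0],
    "city": ["Oslo", None, None],
})

# One boolean per column: True if the column has any missing value
has_missing = train.isnull().any()

# Summing booleans counts the True entries
n_missing_columns = int(has_missing.sum())

# Keep only the True entries and read off their labels
missing_column_names = has_missing[has_missing].index.tolist()

print(n_missing_columns)      # 2
print(missing_column_names)   # ['age', 'city']
```

With the asker's data, the count should come out as 12 if the expectation stated in the question is right.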