I am wondering how to define a function (named Rower) in Python that takes a single input, a pandas DataFrame, and counts how many NaN rows the input has.
I don't know how to start, especially how to define the function in the first place. I am very new to Python and would be happy if you could also add an explanation.
Here is a sample of what I've tried:
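import numpy as np
import pandas as pd
def pandasNull(df):
    # Counts NaN cells across the whole DataFrame, not rows
    return df.isna().sum().sum()
df = pd.DataFrame(np.random.randn(6, 4), index=[1, 2, 3, 4, 5, 6], columns=['A', 'B', 'C', 'D'])
example_df = df[df > 0]  # entries that are not > 0 become NaN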
CodePudding user response:
The exact answer here depends on whether you want to count the rows that have NaN entries in any column or only the rows that have NaN entries in all columns. Below I've defined two functions, one for each operation, and created some sample data.
import pandas as pd
def rower_any(df):
    # Number of rows with NaN in at least one column
    return df.isna().any(axis=1).sum()
def rower_all(df):
    # Number of rows with NaN in every column
    return df.isna().all(axis=1).sum()
sample_data = [('a', 1), ('b', 2), ('c',), ()]
sample_df = pd.DataFrame(data=sample_data, columns=('x1', 'x2'))
The dataframe looks like this:
     x1   x2
0     a  1.0
1     b  2.0
2     c  NaN
3  None  NaN
Calling these functions on our sample data gives rower_any(sample_df) = 2 and rower_all(sample_df) = 1.
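Since the question asks for a single function named Rower, the two variants above could be folded into one. The following is only a sketch: the `how` parameter is a hypothetical switch that does not appear in the original answer, and the sample frame is built with explicit missing values so that it matches the table shown above.

```python
import numpy as np
import pandas as pd


def Rower(df, how="any"):
    """Count rows with NaN. `how` is a hypothetical switch, not part of the answer above."""
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")
    mask = df.isna()  # True wherever a cell is NaN or None
    # Collapse each row to a single True/False flag
    row_flags = mask.any(axis=1) if how == "any" else mask.all(axis=1)
    return int(row_flags.sum())  # True counts as 1 when summed


# Same content as the sample table, with the missing cells written out explicitly
example_df = pd.DataFrame({
    "x1": ["a", "b", "c", None],
    "x2": [1.0, 2.0, np.nan, np.nan],
})

print(Rower(example_df, how="any"))  # rows with at least one NaN
print(Rower(example_df, how="all"))  # rows where every column is NaN
```

Here `def Rower(df, how="any"):` declares a function with one required input, and `return` hands the count back to the caller.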
CodePudding user response:
You could convert the input value to a string and compare it to the string representation of NaN, such as:
count = 0
for _, row in df.iterrows():
    if str(row["column_name"]) == "nan":
        count += 1
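One caveat with the string comparison: it matches a float NaN, but it would also match a cell holding the literal text "nan" and would miss None, whose string form is "None". A rough comparison with pandas' own `isna()` is sketched below. The column contents are made up for illustration; only the column name is taken from the snippet above.

```python
import numpy as np
import pandas as pd

# Illustrative data: a literal "nan" string, two None values, and one float NaN
df = pd.DataFrame({"column_name": ["nan", None, np.nan, None]})

# String-based check: counts the literal "nan" text and the float NaN, skips None
str_count = sum(str(value) == "nan" for value in df["column_name"])

# isna() treats both None and float NaN as missing and ignores the "nan" text
isna_count = int(df["column_name"].isna().sum())

print(str_count, isna_count)
```

On this data the two approaches disagree, which suggests `isna()` is the safer default when a column may hold text or None.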