How do I print the number of rows dropped when running the following code in Python:
df.dropna(inplace = True)
CodePudding user response:
Use:
import numpy as np
import pandas as pd
np.random.seed(2022)
df = pd.DataFrame(np.random.choice([0, np.nan, 1], size=(10, 3)))
print(df)
     0    1    2
0  NaN  0.0  NaN
1  0.0  NaN  NaN
2  0.0  0.0  1.0
3  0.0  0.0  NaN
4  NaN  NaN  1.0
5  1.0  0.0  0.0
6  1.0  0.0  1.0
7  NaN  0.0  1.0
8  1.0  1.0  NaN
9  1.0  0.0  NaN
You can count the rows that contain at least one missing value before dropping them, using DataFrame.isna with DataFrame.any and sum:
count = df.isna().any(axis=1).sum()
df.dropna(inplace = True)
print (df)
     0    1    2
2  0.0  0.0  1.0
5  1.0  0.0  0.0
6  1.0  0.0  1.0
print (count)
7
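The same boolean mask can also tell you which rows are about to be removed, not just how many. A minimal sketch on a small hand-made frame; the names example, mask and dropped_labels are illustrative and not part of the answer above:

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the random one used above)
example = pd.DataFrame({
    "x": [0.0, np.nan, 1.0, 1.0],
    "y": [np.nan, 0.0, 1.0, 0.0],
})

# True for every row holding at least one NaN,
# which are the rows dropna() removes with its default settings
mask = example.isna().any(axis=1)

dropped_labels = example.index[mask].tolist()  # index labels of the doomed rows
count = int(mask.sum())                        # how many rows will be dropped

example.dropna(inplace=True)
print(f"Dropped {count} rows: {dropped_labels}")
```

Note that the mask matches dropna only when it is called with its defaults; if you pass subset= or how='all', the mask has to be built the same way.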
Or take the difference between the number of rows in the DataFrame before and after dropna:
orig = df.shape[0]
df.dropna(inplace = True)
count = orig - df.shape[0]
print (df)
     0    1    2
2  0.0  0.0  1.0
5  1.0  0.0  0.0
6  1.0  0.0  1.0
print (count)
7
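If you need this in more than one place, the before/after difference could be wrapped in a small helper. This is only a sketch: dropna_with_count is a hypothetical name, not a pandas function, and it assumes you are happy to get a new frame back instead of modifying the original in place:

```python
import numpy as np
import pandas as pd


def dropna_with_count(frame, **dropna_kwargs):
    """Return (cleaned_frame, n_dropped) without modifying `frame`.

    Hypothetical helper. Extra keyword arguments are passed to
    DataFrame.dropna; do not pass inplace=True, since dropna
    then returns None.
    """
    cleaned = frame.dropna(**dropna_kwargs)
    n_dropped = len(frame) - len(cleaned)
    return cleaned, n_dropped


# Small illustrative frame
example = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [np.nan, np.nan, 1.0, 2.0],
})

cleaned, n_dropped = dropna_with_count(example)
print(f"Dropped {n_dropped} of {len(example)} rows")

# Works with other dropna options too, e.g. only rows that are entirely NaN
cleaned_all, n_dropped_all = dropna_with_count(example, how="all")
print(f"Dropped {n_dropped_all} all-NaN rows")
```

Because the count comes from the row difference, it stays correct whatever options are passed to dropna, as long as rows (not columns) are being dropped.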
CodePudding user response:
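Make a new DataFrame with the NA rows dropped, then subtract its length from the original's (len(new_data) alone gives the rows kept, not the rows dropped):
new_data = data.dropna(axis=0, how='any')
print(len(data) - len(new_data))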