I have a pandas DataFrame with a column populated by "yes" or "no" strings.
When I call .value_counts() on this column, I get the correct distribution.
But when I run .isna(), it looks as if the whole column is NaN.
I suspect this will cause problems for me later.
Example:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.array([[0,1,2,3,4],[40,30,20,10,0], ['yes','yes','no','no','yes']]).T, columns=['A','B','C'])
len(df['C'].isna()) # 5 --> why?!
df['C'].value_counts() # yes : 3, no: 2 --> as expected.
CodePudding user response:
len gives you the length of the Series (irrespective of its content), not the number of True values.
Use sum if you want the count of True:
df['C'].isna().sum()
# 0
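
To make the difference concrete, here is a minimal sketch that rebuilds the DataFrame from the question and compares the two calls. The variable names (mask, n_rows, n_missing, any_missing, n_missing_after) are made up for illustration, and the NaN written into row 1 is an assumption added only to show the count changing:

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame(
    np.array([[0, 1, 2, 3, 4], [40, 30, 20, 10, 0], ['yes', 'yes', 'no', 'no', 'yes']]).T,
    columns=['A', 'B', 'C'],
)

# isna() returns a boolean Series with one entry per row.
mask = df['C'].isna()

n_rows = len(mask)              # number of rows, whatever the values are
n_missing = int(mask.sum())     # number of True entries, i.e. missing values
any_missing = bool(mask.any())  # quick yes/no check

# Hypothetical step: introduce one real missing value to see the count move.
df.loc[1, 'C'] = np.nan
n_missing_after = int(df['C'].isna().sum())

print(n_rows, n_missing, any_missing, n_missing_after)
```

So the 5 from len only says the column has five rows. The sum (or any) of the mask is what tells you whether anything is actually missing.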