I currently have a df with a column Outliers. When I do:
df.Outliers.value_counts(dropna = False)
I get:
NaN 2862
1.0 600
0.0 257
However, when I try to display only the NaN rows with:
df.loc[df.Outliers == np.nan] # numpy was imported as np
I get an output of 0 rows. Why are the NaN rows not being recognized as NaN? I have verified that these NaN values are of type numpy.float64, so they aren't strings that need to be converted. Why are they not recognized as NaNs sometimes?
CodePudding user response:
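The comparison df.Outliers == np.nan is always False for every row, because under IEEE 754 floating-point rules NaN is not equal to anything, including itself (np.nan == np.nan evaluates to False). That is why the selection comes back empty even though the values really are NaN. Use isna() instead: it returns a boolean mask marking the missing values, which you can pass to loc to select those rows. For your column that would be df.loc[df.Outliers.isna()]. A small example: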
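import numpy as np
import pandas as pd

# Small example frame with missing values in both columns
df = pd.DataFrame({
    'Column1': [np.nan, 2, 3, 4],
    'Column2': [1, np.nan, 3, np.nan]
})
# Keep only the rows where Column1 is NaN
df.loc[df['Column1'].isna()]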
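
To tie this back to the question, here is a minimal sketch that compares the two approaches side by side on an Outliers column. The data is a made-up five-row stand-in, not the asker's actual frame, and the variable names (eq_mask, isna_mask, nan_rows, non_nan_rows) are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's df; the real data is not shown there
df = pd.DataFrame({"Outliers": [np.nan, 1.0, 0.0, np.nan, 1.0]})

# NaN never compares equal to anything, including itself
nan_equals_nan = np.nan == np.nan

# Element-wise equality against NaN is therefore False for every row
eq_mask = df["Outliers"] == np.nan

# isna() checks for missing values directly instead of comparing
isna_mask = df["Outliers"].isna()

# Rows where Outliers is missing, and the complement via notna()
nan_rows = df.loc[isna_mask]
non_nan_rows = df.loc[df["Outliers"].notna()]

print(eq_mask.sum(), isna_mask.sum())
```

With this toy data the equality mask selects nothing, while the isna() mask selects the two missing rows. notna() is the complementary check if you want the rows that do have a value.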