Null values present but not detected

Time:10-27

In the following code, the column df2['age_groups'] definitely contains one null value.

I am trying to append this null value to a list, but this list turns out to be empty.

Why am I running into this problem?

import numpy as np
import pandas as pd
np.random.seed(100)
df1 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df2 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})

df1['age_groups'], bins = pd.cut(df1["ages"], 4, retbins=True)
df2['age_groups'] = pd.cut(df2["ages"], bins=bins)
null_list=[]
for i in df2['age_groups']:
    if i == float('nan'):
        null_list.append(i)
print(null_list) #empty list
print(df2['age_groups'].isna().sum()) # it shows that there is one null value

Using

 type(i) == float('nan')

as the condition gives the same result.

CodePudding user response:

You only need to fix the if condition. NaN never compares equal to anything, including itself, so i == float('nan') is always False. See also this SO question.

Try this:

import numpy as np
import pandas as pd
np.random.seed(100)
df1 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df2 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})

df1['age_groups'], bins = pd.cut(df1["ages"], 4, retbins=True)
df2['age_groups'] = pd.cut(df2["ages"], bins=bins)


null_list=[]
for i in df2['age_groups']:
    if i is np.nan: # <- code changed here to np.nan
        null_list.append(i)
        
print(null_list) 
print(df2['age_groups'].isna().sum())

Output:

[nan]
1
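
A minimal sketch of why the original comparison fails, and why an identity check can be fragile. NaN never compares equal to anything, itself included, and "is" only matches when the value is the very same object as np.nan. The variable names below are made up for illustration:

```python
import math
import numpy as np

nan_a = float('nan')
nan_b = float('nan')

eq_self = nan_a == nan_a        # False: NaN is never equal to itself
eq_other = nan_a == nan_b       # False: same reason
same_object = nan_a is nan_b    # False: two distinct float objects
is_np_nan = nan_a is np.nan     # False: nan_a is not the np.nan object
detected = math.isnan(nan_a)    # True: a proper NaN test

print(eq_self, eq_other, same_object, is_np_nan, detected)
# prints: False False False False True
```

So "is np.nan" works in the loop above only as long as the missing value happens to be the np.nan object itself; a NaN created some other way would slip through.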

CodePudding user response:

To test for missing values (NaN and None, which pandas treats the same way), use pd.isna rather than is or ==:

null_list=[]
for i in df2['age_groups']:
    if pd.isna(i):
        null_list.append(i)
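
If the loop itself is not needed, the same check can probably be done without iterating. A sketch on a small hand-built series; the ages and bin edges are made up for illustration and are not the data from the question:

```python
import pandas as pd

# Hypothetical ages and bin edges: 5 lies outside every bin,
# so pd.cut assigns it a missing value.
ages = pd.Series([5, 25, 45, 65])
age_groups = pd.cut(ages, bins=[10, 30, 50, 70])

mask = age_groups.isna()                 # True where no bin was assigned
null_list = age_groups[mask].tolist()    # the missing entries themselves
null_count = int(mask.sum())             # how many there are
missing_ages = ages[mask].tolist()       # which ages fell outside the bins

print(null_list)
print(null_count)
print(missing_ages)
```

Here null_count is 1 and missing_ages is [5], which also shows where the null comes from: an age outside the range covered by the bins.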