In the following code, the column df2['age_groups'] definitely has one null value.
I am trying to append this null value to a list, but the list turns out to be empty.
Why am I running into this problem?
import numpy as np
import pandas as pd

np.random.seed(100)
df1 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df2 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df1['age_groups'], bins = pd.cut(df1["ages"], 4, retbins=True)
df2['age_groups'] = pd.cut(df2["ages"], bins=bins)
null_list = []
for i in df2['age_groups']:
    if i == float('nan'):
        null_list.append(i)
print(null_list)  # empty list
print(df2['age_groups'].isna().sum())  # shows that there is one null value
Using type(i) == float('nan') as the condition gives the same outcome.
CodePudding user response:
You only need to fix the if condition. See also this SO question.
Try this:
import numpy as np
import pandas as pd

np.random.seed(100)
df1 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df2 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df1['age_groups'], bins = pd.cut(df1["ages"], 4, retbins=True)
df2['age_groups'] = pd.cut(df2["ages"], bins=bins)
null_list = []
for i in df2['age_groups']:
    if i is np.nan:  # <- condition changed here to use np.nan
        null_list.append(i)
print(null_list)
print(df2['age_groups'].isna().sum())
Output:
[nan]
1
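As for why the original comparison never matches: NaN compares unequal to every value, including itself, so i == float('nan') is always False. The identity check with is only succeeds when the value is the very same object as the one it is compared against, which is why it is fragile. Here is a minimal sketch using only the standard library (the variable names are made up for illustration):

```python
import math

nan_a = float("nan")
nan_b = float("nan")

# IEEE 754 semantics: NaN is not equal to anything, not even itself.
print(nan_a == nan_b)   # False
print(nan_a == nan_a)   # False

# Identity only holds when both names refer to the same object.
print(nan_a is nan_b)        # False: two separate float objects
print(math.nan is math.nan)  # True: one module-level object

# A dedicated NaN test works regardless of which object holds the NaN.
print(math.isnan(nan_a))  # True
print(math.isnan(nan_b))  # True
```

So the is check above works only as long as the missing value happens to be the np.nan object itself.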
CodePudding user response:
To test for missing values (NaN and None, which pandas treats the same way), use pd.isna, not is or ==:
null_list = []
for i in df2['age_groups']:
    if pd.isna(i):
        null_list.append(i)
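If the loop itself is not needed, the same result can probably be had with a boolean mask from Series.isna. The sketch below uses a small made-up Series and hand-picked bins rather than the random data from the question, so the names ages, groups and mask are only illustrative:

```python
import pandas as pd

# Hypothetical data: 95 lies outside every bin, so pd.cut yields NaN for it.
ages = pd.Series([12, 35, 58, 95])
groups = pd.cut(ages, bins=[10, 30, 50, 70])

# Boolean mask marking the missing entries.
mask = groups.isna()

# Select the missing entries and count them without an explicit loop.
null_list = groups[mask].tolist()
null_count = int(mask.sum())

print(null_list)
print(null_count)
```

This avoids comparing individual elements at all, which sidesteps the == versus is question.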