Here is a sample CSV file:
X,Y
A,
B,
C,D
After reading this file, pandas treats the empty cells as NaN values:
import pandas as pd

df = pd.read_csv("test.csv")
print(df)

   X    Y
0  A  NaN
1  B  NaN
2  C    D
and after converting it to a list of Python dictionaries, NaN appears as nan:
>>> d = df.to_dict(orient='records')
>>> d
[{'X': 'A', 'Y': nan}, {'X': 'B', 'Y': nan}, {'X': 'C', 'Y': 'D'}]
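These nan entries are ordinary floating-point NaN values; the lowercase nan is simply how Python's repr prints them. A quick check in the same session:

>>> type(d[0]['Y'])
<class 'float'>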
I'm trying to find where the nulls are using math.isnan(), but it throws an exception:
import math

for i, v in enumerate(d):
    if math.isnan(v['Y']):
        print(i)

0
1
TypeError: must be real number, not str
The exception occurs because math.isnan() only accepts real numbers and the last row holds the string 'D'. It can be handled with a try/except:
for i, v in enumerate(d):
    try:
        if math.isnan(v['Y']):
            print(i)
    except:
        pass
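A slightly safer sketch of the same workaround would check the type before calling math.isnan(), so non-float values are skipped instead of caught:

for i, v in enumerate(d):
    # only floats can be NaN; strings like 'D' are simply skipped
    if isinstance(v['Y'], float) and math.isnan(v['Y']):
        print(i)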
But is there a better way to find nan values?
CodePudding user response:
IIUC use pandas.isna:
for i, v in enumerate(d):
    if pd.isna(v['Y']):
        print(i)
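For reference, pd.isna also works on scalars, so it handles the mixed float/str values in this column without raising:

>>> import pandas as pd
>>> pd.isna(float('nan'))
True
>>> pd.isna(None)
True
>>> pd.isna('D')
False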
CodePudding user response:
You don't need any function: you can compare the value to itself. NaN has the interesting property of not being equal to itself:
for i, v in enumerate(d):
    if v['Y'] != v['Y']:
        print(i)
output:
0
1
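To see the property in isolation, here is a quick standalone check:

>>> nan = float('nan')
>>> nan != nan
True
>>> 'D' != 'D'
False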
Another possibility would be to generate this list from the DataFrame itself (note that you would get the real indices here: 0 and 1 if this is a range index, otherwise the first and second values of the index):
s = df['Y'].isna()
na_indices = s[s].index.to_list()
output: [0, 1]
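An equivalent one-liner, assuming the same df, filters the index directly with the boolean mask:

na_indices = df.index[df['Y'].isna()].tolist()
print(na_indices)  # [0, 1]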