How to get rid of or ignore NaNs? Float error when using split

In the y column I have words, but some rows contain NaNs, so I get an error saying that I can't use split on a float.

What should I change?

Column y looks like: text1, text2, nan, nan, text3


import csv
import pandas as pd

missing_values = ["#N/A", "None", "nan"]
df_eval = pd.read_csv('e.csv', na_values=missing_values)


with open('e1.csv', 'a', newline='', errors='ignore', encoding="utf8") as f:
    writer = csv.writer(f)
    # df_eval = pd.read_csv(f, na_values=missing_values)

    for j in range(len(df_eval)):
        x = df_eval.iloc[j, 1]
        # print("X", x)
        # x1 = x.split("|")
        # print(x1)
        y = df_eval.iloc[j, 2]
        # df_new = df[df_eval.iloc[j,2]].notnull()]

        # my attempt at skipping rows where y is NaN
        if str(y == "nan"):
            continue
        print("y", y)

Thank you in advance

CodePudding user response：

You need to handle the case where x is NaN separately. You could either skip that iteration using continue, as shown below, or set x to an empty string (x = '') if you think that is more appropriate.

...
for j in range(len(df_eval)):
    x = df_eval.iloc[j, 1]
    if pd.isna(x):
        continue
    ...

...
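
To make the two options concrete, here is a minimal sketch that runs on its own. The CSV text and its column layout are assumptions made up for illustration (the real e.csv is not shown), and the "|" separator is taken from the commented-out split in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for e.csv; the real file layout is not shown in the question.
csv_text = "id,x,y\n1,a|b,text1\n2,,text2\n3,c,\n"
df_eval = pd.read_csv(io.StringIO(csv_text), na_values=["#N/A", "None", "nan"])

skipped = []  # option 1: rows with a missing x are left out
blanked = []  # option 2: a missing x is replaced by an empty string

for j in range(len(df_eval)):
    x = df_eval.iloc[j, 1]

    # Option 1: only split when x is not missing
    if not pd.isna(x):
        skipped.append(x.split("|"))

    # Option 2: substitute an empty string before splitting
    x_or_empty = "" if pd.isna(x) else x
    blanked.append(x_or_empty.split("|"))

print(skipped)
print(blanked)
```

A check written as str(y == "nan") does not work as a guard: y == "nan" evaluates to False for a real NaN, and str(False) is the non-empty string "False", which is always truthy, so every row would be skipped. pd.isna() tests for a missing value directly.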

CodePudding user response：

I suspect you just want something like this:

import pandas as pd

for _, row in df_eval.iterrows():
    # pd.isna() accepts strings as well as floats; math.isnan() raises TypeError on a str
    entries = [x for x in row if not pd.isna(x)]
    do_stuff(entries)

In other words, since you're iterating anyway, you might as well build a standard Python type (a list) containing the values you want, and then work with that.
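
As a rough, self-contained illustration of that idea: the DataFrame contents and the body of do_stuff below are hypothetical placeholders. The sketch uses pd.isna() for the check because math.isnan() raises a TypeError when it is handed a string.

```python
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
df = pd.DataFrame({"x": ["a|b", None, "c"], "y": ["text1", "text2", None]})


def do_stuff(entries):
    # Placeholder for the real per-row work: split each remaining string on "|"
    return [entry.split("|") for entry in entries]


results = []
for _, row in df.iterrows():
    # Keep only the values that are not missing, as a plain Python list
    entries = [value for value in row if not pd.isna(value)]
    results.append(do_stuff(entries))

print(results)
```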

The basic approach, checking whether a value is NaN before using it, is widely applicable.

Note that you can't use .split() on a float, whether it's nan or not.
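
One way to cover both cases at once is to check the type instead of checking for NaN. The helper name split_if_text and the sample values are made up for this sketch; returning an empty list for non-strings is just one possible choice.

```python
import math

# NaN is itself a float, which is why .split() fails on it
print(type(math.nan))

values = ["text1,text2", math.nan, 3.5, "text3"]


def split_if_text(value, sep=","):
    """Return the split parts for a string, and an empty list for anything else."""
    if isinstance(value, str):
        return value.split(sep)
    return []


parts = [split_if_text(v) for v in values]
print(parts)
```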