In the y column I have words, but some rows contain NaN, so I get an error saying I can't use split on a float.
What should I change?
Column y looks like: text1, text2, nan, nan, text3
missing_values = ["#N/A","None","nan"]
df_eval = pd.read_csv('e.csv',na_values = missing_values)
with open('e1.csv', 'a', newline='',errors='ignore', encoding="utf8" ) as f:
    writer = csv.writer(f)
    #df_eval= pd.read_csv(f,na_values = missing_values)
    # Importing the dataset
    for j in range(len(df_eval)):
        x=df_eval.iloc[j,1]
        #print("X", x)
        # x1=x.split("|")
        # # print(x1)
        y=df_eval.iloc[j,2]
        #df_new = df[df_eval.iloc[j,2]].notnull()]
        if str(y=="nan"):
            continue
        print("y", y)
Thank you in advance
CodePudding user response:
You need to handle the case where x is NaN separately. You can either skip that iteration with continue, as shown below, or set x to an empty string (x = '') if you think that is more appropriate.
...
for j in range(len(df_eval)):
    x=df_eval.iloc[j,1]
    if pd.isna(x):
        continue
...
...
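For reference, here is a minimal self-contained sketch of the skip-on-NaN approach. The DataFrame contents, the column names, the "|" separator, and the helper name split_non_missing are assumptions made up for illustration, not taken from the real e.csv. It also shows why the original check never worked: y == "nan" evaluates to False for a float NaN, and str(False) is the non-empty string "False", which is always truthy, so the loop hits continue on every row.

```python
import pandas as pd

# Hypothetical stand-in for e.csv: column "y" mixes strings and missing values.
df_eval = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "x": ["a|b", "c", "d|e", "f"],
    "y": ["text1|text2", float("nan"), None, "text3"],
})


def split_non_missing(frame, column, sep="|"):
    """Return the split values of `column`, skipping rows where the cell is missing."""
    parts = []
    for value in frame[column]:
        if pd.isna(value):  # True for both NaN and None
            continue
        parts.append(str(value).split(sep))
    return parts


# The original condition is always truthy, even for a real NaN:
always_true = bool(str(float("nan") == "nan"))  # str(False) == "False"

print(always_true)
print(split_non_missing(df_eval, "y"))
```

If you would rather keep the row and treat a missing value as empty, replace the continue with value = '' before splitting.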
CodePudding user response:
I suspect you just want something like this:
import pandas as pd
for _, row in df.iterrows():
    # pd.isna also accepts strings, whereas math.isnan raises TypeError on them
    entries = [x for x in row if not pd.isna(x)]
    do_stuff(entries)
In other words, since you're iterating anyhow, you might as well build a standard Python type (a list) containing the values you want, and then work with that.
The basic approach, checking whether a value is NaN before you use it, is more widely applicable.
Note that you can't use .split() on a float, whether it's nan or not.
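To illustrate that last point, here is a small sketch of a type-based guard. The helper name safe_split and the choice to return an empty list for non-strings are assumptions for illustration; you may prefer to skip such values or raise instead.

```python
def safe_split(value, sep=","):
    """Split strings; return an empty list for anything else (NaN, None, numbers)."""
    if isinstance(value, str):
        return value.split(sep)
    return []


nan = float("nan")

# Floats simply have no .split method, NaN or otherwise.
float_has_split = hasattr(nan, "split") or hasattr(1.5, "split")

print(float_has_split)
print(safe_split("text1,text2"))
print(safe_split(nan))
```

Checking the type rather than checking for NaN also covers ordinary numbers that end up in a text column.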