I'm trying to read a CSV file in which every value of a column is a list stored as a string. Example:
accuracy_per_item
"[0.2,0.3,0.4]"
"[0.4,0.2,nan]"
I can read the column values that don't contain nan using:
pd.read_csv('accuracy_per_item.csv', converters={'accuracy_per_item': pd.eval})
However, when pd.eval encounters "nan", it raises an error:
--> UndefinedVariableError: name 'nan' is not defined
How can I read in this csv with pandas recognizing "nan" as np.nan?
Answer:
You can define a custom converter that quotes each nan before calling pd.eval and then maps the resulting strings back to np.nan:
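import numpy as np
import pandas as pd

def handle_nan(x):
    # Quote nan so pd.eval parses it as a string instead of an undefined name
    x = x.replace('nan', '"nan"')
    lst = pd.eval(x)
    # Swap the placeholder strings for real NaN values
    lst = [np.nan if i == 'nan' else i for i in lst]
    return lst

df = pd.read_csv('accuracy_per_item.csv', converters={'accuracy_per_item': handle_nan})
print(df)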
  accuracy_per_item
0   [0.2, 0.3, 0.4]
1   [0.4, 0.2, nan]
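If every list holds only numbers and nan, another option is to skip pd.eval and parse each cell by hand, since Python's built-in float() already understands the string "nan". The following is only a minimal sketch under that assumption: parse_float_list is a hypothetical helper name, and the CSV is an in-memory stand-in built from the two rows in the question rather than the real file.

```python
import io
import math

import pandas as pd


def parse_float_list(cell):
    """Parse a string such as "[0.4,0.2,nan]" into a list of floats.

    Hypothetical helper: assumes a flat list of numbers and nan only.
    """
    inner = cell.strip().strip("[]")
    if not inner.strip():
        return []  # an empty list "[]" has nothing to convert
    # float() accepts "nan" and ignores surrounding whitespace
    return [float(item) for item in inner.split(",")]


# In-memory stand-in for accuracy_per_item.csv (rows taken from the question)
csv_text = 'accuracy_per_item\n"[0.2,0.3,0.4]"\n"[0.4,0.2,nan]"\n'

df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"accuracy_per_item": parse_float_list},
)
print(df)
```

With the real file, passing 'accuracy_per_item.csv' in place of io.StringIO(csv_text) should behave the same way. This sketch would not handle nested lists or non-numeric entries; for those, the pd.eval converter above may be the better fit.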