I have a CSV file that looks like this:
COL0;COl1;COL2;COL3;...;COL9999
SomeText0;[-3.45,0.23];[-1.40,0.21];[-1.35,0.13];...;[-1.87,0.12]
SomeText1;[-3.05,0.20];[-0.40,0.01];[-0.05,0.03];...;[-1.65,0.33]
SomeText2;[-0.40,0.03];[-1.00,0.20];[-0.35,0.03];...;[-1.43,0.12]
...
All cells are strings (e.g. "[-3.45,0.23]"), but I want them to be 1-D np.float64 arrays (except COL0, of course).
How do I do this efficiently?
Answer:
Read the CSV normally, then use ast.literal_eval (from the standard library's ast module) to parse each string into a list, and np.asarray to turn each list into an array of floats:
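import ast
import numpy as np
import pandas as pd
df = pd.read_csv('YOUR FILE.csv', sep=';')
# Parse every column from 'COl1' onward: string -> list -> NumPy array
df.loc[:, 'COl1':] = df.loc[:, 'COl1':].apply(lambda col: col.apply(ast.literal_eval).apply(np.asarray))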
Output:
>>> df
        COL0           COl1          COL2           COL3        COL9999
0  SomeText0  [-3.45, 0.23]  [-1.4, 0.21]  [-1.35, 0.13]  [-1.87, 0.12]
1  SomeText1   [-3.05, 0.2]  [-0.4, 0.01]  [-0.05, 0.03]  [-1.65, 0.33]
2  SomeText2   [-0.4, 0.03]   [-1.0, 0.2]  [-0.35, 0.03]  [-1.43, 0.12]
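Since the question asks for np.float64 specifically, here is a minimal sketch of a variant that sets the dtype explicitly and parses the cells by splitting the string rather than going through ast.literal_eval. It assumes every cell is a flat, bracketed, comma-separated list of numbers, as in the sample. The helper name parse_cell and the three-column in-memory sample are made up for illustration. Whether this is faster than the approach above on the real file is something you would have to time.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample mirroring the layout in the question (only three columns).
csv_text = (
    "COL0;COl1;COL2\n"
    "SomeText0;[-3.45,0.23];[-1.40,0.21]\n"
    "SomeText1;[-3.05,0.20];[-0.40,0.01]\n"
)


def parse_cell(text):
    """Convert a string such as '[-3.45,0.23]' into a 1-D float64 array."""
    # Drop the surrounding brackets, split on commas, and let NumPy parse the numbers.
    return np.array(text.strip("[]").split(","), dtype=np.float64)


df = pd.read_csv(io.StringIO(csv_text), sep=";")

# Parse every column except the first (text) column, one column at a time.
for name in df.columns[1:]:
    df[name] = df[name].map(parse_cell)

# Each parsed cell is now a NumPy array with dtype float64.
first = df.loc[0, "COl1"]
```

Passing dtype=np.float64 means a cell containing only integers (e.g. "[1,2]") still ends up as a float array, which np.asarray on a parsed list would not guarantee.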