I have a dataframe where one of the columns contains a list of values in each row.
For example:
type(df['col_list'].values[0])
= list
I saved this dataframe as a CSV file (df.to_csv('my_file.csv')).
When I load the dataframe (df = pd.read_csv('my_file.csv')), the column that contained lists of values changes to string type:
type(df['col_list'].values[0])
= str
When converting to a list (list(df['col_list'].values[0])), I get a list of characters instead of a list of values.
How can I save and load a dataframe in which one of its columns contains lists of values?
CodePudding user response:
This happens because saving to CSV serializes each list to its string representation; the CSV format cannot store a list object as it is. Try saving in another format, for example df.to_pickle('test.df'). You can then read this back into a dataframe with pd.read_pickle('test.df').
The pandas documentation for to_pickle has more on saving to pickle.
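A minimal sketch of the pickle round trip described above. The sample data, the column name id, and the temporary file name are made up for illustration; only to_pickle and read_pickle come from the answer.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample frame: one column holds Python lists.
df = pd.DataFrame({"id": [1, 2], "col_list": [[1, 2, 3], ["a", "b"]]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.pkl")
    df.to_pickle(path)               # stores the Python objects as they are
    restored = pd.read_pickle(path)  # the lists come back as lists

first_value = restored["col_list"].iloc[0]
print(type(first_value))
```

Pickle keeps the original Python types, but the file is only readable from Python, and pickle files should only be loaded from sources you trust.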
CodePudding user response:
Use the JSON or HDF file format instead of CSV. The CSV file format is really inconvenient for storing a list or a collection of objects.
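As a rough illustration of the JSON option, the sketch below writes and reads a small frame with to_json and read_json. The sample data, the file name, and the choice of orient="records" are assumptions, not something the answer specifies.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample frame: one column holds lists of numbers.
df = pd.DataFrame({"id": [1, 2], "col_list": [[1, 2, 3], [4, 5]]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_file.json")
    # Each row becomes one JSON object; each list becomes a JSON array.
    df.to_json(path, orient="records")
    restored = pd.read_json(path, orient="records")

first_value = restored["col_list"].iloc[0]
print(type(first_value), first_value)
```

JSON arrays map back to Python lists on load, so no extra parsing step is needed for the list column.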
CodePudding user response:
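I think Anurag's suggestion is very good. But in case you want to keep things the way they are, this will do the job, provided the stored strings are valid JSON (a list of numbers such as [1, 2, 3] is; a list of strings saved as ['a', 'b'] is not, because JSON requires double quotes):
import json
df['col_list'] = df['col_list'].apply(json.loads)
This works more reliably if you convert col_list into JSON text before calling df.to_csv:
df['col_list'] = df['col_list'].apply(json.dumps)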
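Put together, the CSV round trip might look like the sketch below. The sample data and the in-memory buffer (used here in place of my_file.csv) are assumptions. The last lines show ast.literal_eval, which is not part of the answer above but is a common way to parse a file that was already written without json.dumps.

```python
import ast
import io
import json

import pandas as pd

# Hypothetical sample frame: one column holds lists.
df = pd.DataFrame({"id": [1, 2], "col_list": [["a", "b"], [1, 2, 3]]})

# Before saving: turn each list into JSON text.
to_save = df.copy()
to_save["col_list"] = to_save["col_list"].apply(json.dumps)
buffer = io.StringIO()  # stands in for a file such as my_file.csv
to_save.to_csv(buffer, index=False)

# After loading: parse the JSON text back into lists.
buffer.seek(0)
restored = pd.read_csv(buffer)
restored["col_list"] = restored["col_list"].apply(json.loads)
print(type(restored["col_list"].iloc[0]))

# For a CSV already written from raw lists, the cells hold Python
# list syntax (single quotes), which ast.literal_eval can parse.
legacy_value = ast.literal_eval("['a', 'b']")
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for strings read from a file.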