Convert string dictionary in pandas.core.series.Series to dictionary in Python

I read my data from Excel into a DataFrame. One of the columns holds values that look like dictionaries but are stored as strings. I want to convert every row of that column (more than 40k rows) from string to dictionary. Printing the column gives:

df['fruit']
 0    NaN                            
 1    {'apple': [{'A': 1, 'B': 2, ...
 2    {'apple': [{'A': 3, 'B': 4, ...
 3    {'orange': [{'A': 5, 'B': 6...   
 4    {'apple': [{'A': 0, 'B': 9, ...

If I call to_dict() on the column, I get the following:

df['fruit'].to_dict()
{0: NaN,
 1: "{'apple': [{'A': 1, 'B': 2, ...}",
 2: "{'apple': [{'A': 3, 'B': 4, ...}",
 3: "{'orange': [{'A': 5, 'B': 6...}",
 4: "{'apple': [{'A': 0, 'B': 9, ...}"}

When I then call to_dict('list'), I get the following error:

df['fruit'].to_dict('list')
....
TypeError: unsupported type: <class 'str'>

I want real dictionaries because I only need the values stored under 'B' in the rows keyed by 'orange'.

Any help would be greatly appreciated!

CodePudding user response:

Use:

import pandas as pd
df = pd.DataFrame({'string dict':["{'a': 1}", "{'b':2}"]})

df['string dict'].apply(eval)

which can be validated as follows:

type(df['string dict'].apply(eval)[0])

returns:

dict
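
Since eval executes whatever code the string contains, ast.literal_eval from the standard library may be a safer choice when the spreadsheet content is not fully trusted. A minimal sketch, assuming every cell is a string holding a valid Python literal (the sample data is the same as above):

```python
import ast

import pandas as pd

df = pd.DataFrame({'string dict': ["{'a': 1}", "{'b': 2}"]})

# literal_eval only accepts Python literals (dicts, lists, numbers, strings, ...)
# and raises an error instead of running arbitrary code.
parsed = df['string dict'].apply(ast.literal_eval)

print(type(parsed[0]))
```

The call pattern is the same as with eval, so it should be usable as a drop-in replacement for data like the column in the question.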

Based on your comment, fill the missing values first so that eval only receives strings:

df['string dict'].fillna('{}').apply(eval)
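
If the missing rows should stay missing rather than become empty dicts, Series.map with na_action='ignore' skips the NaN cells. A sketch under the assumption that every non-missing cell is a valid dict string, using ast.literal_eval in place of eval:

```python
import ast

import numpy as np
import pandas as pd

df = pd.DataFrame({'string dict': ["{'a': 1}", "{'b': 2}", np.nan]})

# na_action='ignore' passes NaN through untouched, so the parser
# is only called on the cells that actually hold a string.
parsed = df['string dict'].map(ast.literal_eval, na_action='ignore')

print(parsed)
```

Which variant fits better depends on whether later code expects a dict in every row or prefers to detect missing rows with isna().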

I reproduced your error using the following test data:

import numpy as np
df = pd.DataFrame({'string dict':["{'a': 1}", "{'b':2}", np.nan, 2]})
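
For a column that mixes dict strings, NaN, and other non-string values like the test data above, one possible approach is a small helper that falls back to an empty dict, followed by a second helper that pulls out the 'B' values under 'orange'. This is only a sketch: parse_cell and values_for are hypothetical names, the sample rows are made up, and it assumes each outer key maps to a list of dicts as in the question.

```python
import ast

import numpy as np
import pandas as pd


def parse_cell(value):
    """Return a dict for dict-like strings and an empty dict for anything else."""
    if not isinstance(value, str):
        return {}  # NaN, numbers, and other non-string cells
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return {}  # strings that are not valid Python literals
    return parsed if isinstance(parsed, dict) else {}


def values_for(cell, outer_key, inner_key):
    """Collect inner_key values from the list of records stored under outer_key."""
    return [rec[inner_key] for rec in cell.get(outer_key, []) if inner_key in rec]


# Made-up sample data shaped like the column in the question
df = pd.DataFrame({'fruit': [
    np.nan,
    "{'apple': [{'A': 1, 'B': 2}]}",
    "{'orange': [{'A': 5, 'B': 6}]}",
    2,
]})

df['fruit'] = df['fruit'].apply(parse_cell)
df['orange_B'] = df['fruit'].apply(lambda cell: values_for(cell, 'orange', 'B'))

print(df['orange_B'].tolist())
```

Rows without an 'orange' key end up with an empty list, so the new column can be filtered or exploded afterwards without special-casing the missing rows.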