DataFrame from list of string dicts

Time:08-09

So I have a list where each entry looks something like this:

"{'A': 1, 'B': 2, 'C': 3}"

I am trying to get a dataframe that looks like this

    A   B   C
0   1   2   3
1   4   5   6 
2   7   8   9

But I'm having trouble converting the format into something that can be read into a DataFrame. I know that pandas should automatically convert dicts into dataframes, but since my list elements are surrounded by quotes, it's getting confused and giving me

               0
0  {'A': 1, 'B': 2, 'C': 3}
...

I've tried using json, concatenating a list of DataFrames, and so on, but to no avail.

CodePudding user response:

eval is not safe. Check this comparison.

Instead use ast.literal_eval:

Assuming this to be your list:

In [572]: l = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 4, 'B': 5, 'C': 6}"]

In [584]: import ast
     ...: import pandas as pd
In [587]: df = pd.DataFrame([ast.literal_eval(i) for i in l])

In [588]: df
Out[588]: 
   A  B  C
0  1  2  3
1  4  5  6
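
To illustrate why ast.literal_eval is the safer choice, here is a minimal sketch. The names `rows` and `rejected` are made up for this example, and the "malicious" string is only an illustration. The point is that literal_eval parses Python literals and refuses anything else, such as a function call, rather than running it.

```python
import ast

import pandas as pd

# Hypothetical sample data in the same shape as the question
rows = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 4, 'B': 5, 'C': 6}"]

# literal_eval only accepts literals: dicts, lists, numbers, strings, etc.
df = pd.DataFrame([ast.literal_eval(s) for s in rows])

# A string containing a function call is rejected instead of executed
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```

With eval, that last string would have been executed; with literal_eval it raises ValueError.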

CodePudding user response:

I agree with SomeDude that eval will work, like this:

pd.DataFrame([eval(s) for s in l])  

BUT, if any user-entered data is going into these strings, you should never use eval. Instead, you can convert the single quotes to double quotes and parse each string with json.loads from the json package. This is much safer:

json.loads(u'{"A": 1, "B": 2, "C": 3}')
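
Applied to the whole list, the quote swap might look roughly like the sketch below. The names `rows`, `records` and `df_json` are made up for the example. The plain str.replace is an assumption that only holds when no key or value contains an apostrophe or a double quote of its own; otherwise the swap would corrupt the string.

```python
import json

import pandas as pd

# Hypothetical sample data in the same shape as the question
rows = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 4, 'B': 5, 'C': 6}"]

# Naive quote swap: fine for this data, but it breaks on values that
# themselves contain ' or " characters
records = [json.loads(s.replace("'", '"')) for s in rows]

df_json = pd.DataFrame(records)
```

If the strings might contain quotes inside values, ast.literal_eval on the original strings is the more robust option.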

CodePudding user response:

Take your list of strings, turn it into a list of dictionaries, then construct the data frame using pd.DataFrame.from_records.

>>> l = ["{'A': 1, 'B': 2, 'C': 3}"]
>>> pd.DataFrame.from_records(eval(s) for s in l) 
   A  B  C
0  1  2  3

Check that your input data doesn't include arbitrary Python code, however, because eval executes whatever expression the string contains. Using it on anything web-facing would be a severe security flaw.

CodePudding user response:

You can try

import pandas as pd

lst = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 1, 'B': 2, 'C': 3}"]


df = pd.DataFrame(map(eval, lst))

# or

df = pd.DataFrame(lst)[0].apply(eval).apply(pd.Series)

# or

df = pd.DataFrame(lst)[0].apply(lambda x: pd.Series(eval(x)))
print(df)

   A  B  C
0  1  2  3
1  1  2  3

CodePudding user response:

Apply eval to each string before building the DataFrame:

pd.DataFrame([eval(s) for s in l])

Or better use ast.literal_eval as @Mayank Porwal's answer says.

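Or use json.loads, but only after making sure each string is valid JSON (JSON requires double quotes, so the single-quoted strings in the question are not).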
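
One way to combine the last two suggestions is sketched below. `parse_record` is a hypothetical helper, not something from pandas or the standard library: it tries json.loads first and falls back to ast.literal_eval when the string is not valid JSON. The `mixed` list is made-up data.

```python
import ast
import json

import pandas as pd


def parse_record(text):
    """Hypothetical helper: parse valid JSON, else a Python literal."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Not valid JSON (e.g. single-quoted keys), so try a literal instead
        return ast.literal_eval(text)


# One valid JSON string and one Python-style dict string
mixed = ['{"A": 1, "B": 2, "C": 3}', "{'A': 4, 'B': 5, 'C': 6}"]

df_mixed = pd.DataFrame([parse_record(s) for s in mixed])
```

Neither branch executes code, so this stays safe even if the strings come from users.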
