So I have a list where each entry looks something like this:
"{'A': 1, 'B': 2, 'C': 3}"
I am trying to get a dataframe that looks like this
A B C
0 1 2 3
1 4 5 6
2 7 8 9
But I'm having trouble converting the entries into something that can be read into a DataFrame. I know pandas can build a DataFrame from a list of dicts, but because each of my list elements is a string (the dict is wrapped in quotes), I get a single column instead:
0
0 {'A': 1, 'B': 2, 'C': 3}
...
I've tried using json, concatenating a list of DataFrames, and so on, but to no avail.
CodePudding user response:
eval is not safe. Check this comparison. Instead, use ast.literal_eval:
Assuming this to be your list:
In [572]: l = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 4, 'B': 5, 'C': 6}"]
In [584]: import ast
In [585]: import pandas as pd
In [587]: df = pd.DataFrame([ast.literal_eval(i) for i in l])
In [588]: df
Out[588]:
A B C
0 1 2 3
1 4 5 6
CodePudding user response:
I agree with SomeDude that eval will work, like this:
pd.DataFrame([eval(s) for s in l])
BUT, if any user-entered data is going into these strings, you should never use eval. Instead, you can convert the single quotes to double quotes and parse each string with json.loads from the json module, as in the following call. This is much safer.
json.loads(u'{"A": 1, "B": 2, "C": 3}')
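To make the quote-conversion idea concrete, here is a minimal sketch. It assumes every string looks like the ones in the question; the names rows and to_json_text are made up for illustration. A plain str.replace is a naive swap: it breaks as soon as a key or value contains an apostrophe, and Python-only literals such as True or None are not valid JSON either.

```python
import json

import pandas as pd

# Sample input mirroring the question's format (assumed, not real data).
rows = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 4, 'B': 5, 'C': 6}"]


def to_json_text(s):
    # Naive swap of quote characters; only safe when no key or value
    # itself contains a quote.
    return s.replace("'", '"')


# Parse each converted string into a dict, then build the DataFrame.
df = pd.DataFrame([json.loads(to_json_text(s)) for s in rows])
print(df)
```

If the strings might contain apostrophes or Python-only literals, ast.literal_eval is probably the better fit.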
CodePudding user response:
Take your list of strings, turn it into a list of dictionaries, then construct the data frame using pd.DataFrame.from_records.
>>> l = ["{'A': 1, 'B': 2, 'C': 3}"]
>>> pd.DataFrame.from_records(eval(s) for s in l)
A B C
0 1 2 3
Check that your input data doesn't include arbitrary Python code, however, because eval simply executes whatever expression it is given. Using it on anything web-facing, or otherwise fed by untrusted input, would be a severe security flaw.
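As a rough illustration of the difference, the sketch below uses ast.literal_eval, which accepts only Python literals (dicts, lists, numbers, strings and so on) and raises instead of running code. The helper name parse_literal and the sample strings are made up for this example.

```python
import ast


def parse_literal(s):
    """Return the Python literal encoded in s, or None if s is not a plain literal."""
    try:
        return ast.literal_eval(s)
    except (ValueError, SyntaxError):
        # Function calls, attribute access, etc. are rejected here
        # without ever being executed.
        return None


good = parse_literal("{'A': 1, 'B': 2, 'C': 3}")
bad = parse_literal("__import__('os').getcwd()")

print(good)
print(bad)
```

With eval, the second string would actually import os and call a function; with ast.literal_eval it is simply refused.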
CodePudding user response:
You can try
lst = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 1, 'B': 2, 'C': 3}"]
df = pd.DataFrame(map(eval, lst))
# or
df = pd.DataFrame(lst)[0].apply(eval).apply(pd.Series)
# or
df = pd.DataFrame(lst)[0].apply(lambda x: pd.Series(eval(x)))
print(df)
A B C
0 1 2 3
1 1 2 3
CodePudding user response:
Use eval before reading it into a DataFrame:
pd.DataFrame([eval(s) for s in l])
Or better, use ast.literal_eval, as @Mayank Porwal's answer says.
Or use json.loads, but only after making sure the strings are valid JSON.
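One way to "make sure" is to let json.loads do the checking and catch its error. This is only a sketch: load_valid_json and the sample items are hypothetical names, and what to do with rejected strings (skip, log, fall back to ast.literal_eval) is up to you.

```python
import json

import pandas as pd


def load_valid_json(strings):
    """Split strings into parsed JSON objects and the raw strings that failed."""
    parsed, rejected = [], []
    for s in strings:
        try:
            parsed.append(json.loads(s))
        except json.JSONDecodeError:
            # Not valid JSON, e.g. single-quoted keys as in the question.
            rejected.append(s)
    return parsed, rejected


# One valid JSON string and one Python-style (single-quoted) string.
items = ['{"A": 1, "B": 2, "C": 3}', "{'A': 4, 'B': 5, 'C': 6}"]
parsed, rejected = load_valid_json(items)

df = pd.DataFrame(parsed)
print(df)
print(rejected)
```

Note that the strings in the question use single quotes, so as written they would all land in rejected.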