Home > Net >  Convert JSON dictionary to Pandas Dataframe
Convert JSON dictionary to Pandas Dataframe

Time:03-18

I need to convert a JSON dictionary to a Pandas DataFrame, but the embedding is tripping me up.

Here is basically what the JSON dict looks like.

{
  "report": "{'name':{'data':[{'key 1':'value 1','key 2':'value 2'},{'key 1':'value 1','key 2':'value 2'}]}}"
}

In the DataFrame, I want the keys to be the column headers and values in the rows below them.

The extra layer of embedding is throwing me off somewhat from all the usual methods of doing this.

One tricky part is 'name' will change each time I get this JSON dict, so I can't use an exact sting value for 'name'.

CodePudding user response:

Your JSON looks a bit odd. It looks more like a Python dict converted to a string, so you can use ast.literal_eval (a built-in function) to convert it to a real dict, and then use pd.json_normalize to get it into a dataframe form:

import ast

j = ...
parsed_json = ast.literal_eval(j['report'])
df = pd.json_normalize(parsed_json, record_path=[list(parsed_json)[0], 'data'])

Output:

>>> df
     key 1    key 2
0  value 1  value 2
1  value 1  value 2

CodePudding user response:

The error suggests that you're trying to index the strings (because the value under report is a string) using another string.

You just need ast.literal_eval to parse the string and a DataFrame constructor. If the "name" is unknown, you can iterate over the dict.values after you parse the string.

import ast
out = pd.DataFrame(y for x in ast.literal_eval(your_data['report']).values() for y in x['data'])

Output:

     key 1    key 2
0  value 1  value 2
1  value 1  value 2
  • Related