Nested Json in to dataframe (pandas)-CodePudding

I`m importing the following json. It is not super extensive but I just added a small piece here for demonstration:

{'request': {'Target': 'Offer',
  'Format': 'json',
  'Service': 'XXXXX',
  'Version': '2',
  'NetworkToken': 'uu238939871',
  'Method': 'findAll',
  'fields': ['preview_url',
   'sub1',
   'sub2',
   'sub3',
   'sub4',
   'sub5',
   'sub6',
   'sub7',
   'sub8',
   'sub9',
   'sub10',
   'sub11',
   'sub12'],
  'filters': {'status': 'active', 'advertiser_id': '626'}},
 'response': {'status': 1,
  'httpStatus': 200,
  'data': {'231': {'Offer': {'sub1': '231',
     'sub2': '626',
     'sub3': 'XXXXXXXX',
     'sub4': 'XXXXXXX',
     'sub5': 'https://play.google.com/store/apps',
     'sub6': 'https://XXXXX',
     'sub7': '0.39',
     'sub8': 'active',
     'sub9': 'None',
     'sub10': '0',
     'sub11': '0',
     'sub12': '1',
     'sub13': 'XXXXX'}},
   '216': {'Offer': {'sub1': '216',
     'sub2': '626',
     'sub3': 'XXXXXXXX',
     'sub4': '682666',
     'sub5': 'https://play.google.com/store/appsg',
     'sub6': 'https://XXXXXXX',
     'sub7': '1.96000',
     'sub8': 'active',
     'sub9': 'None',
     'sub10': '0',
     'sub11': '0',
     'sub12': '1',
     'sub13': 'XXXXX'}}
 'errors': [],
  'errorMessage': None}}

Im trying to get this into a data frame by using df = pd.json_normalize(data) however the only thing Im getting is a df with 1 row and more than 400 columns, which is not ideal. I also tried to use flatten but got an error where it just says 'Data'

from flatten_json import flatten
dic_flattened = (flatten(d, '.') for d in data['data'])
df = pd.DataFrame(dic_flattened)

The final data frame should have each sub in a column and the fields in rows, for example:

Sub1	Sub2	Sub3
231	626	XXXXXXXX
216	626	XXXXXXXX

CodePudding user response：

It's simpler if you reformat the dictionary in Python first before constructing the DataFrame. Since you want the Sub keys, you can simply extract them using a list comprehension:

out = pd.DataFrame([v['Offer'] for v in my_data['response']['data'].values() if isinstance(v, dict)])

Output:

  sub1 sub2      sub3     sub4                                 sub5             sub6     sub7    sub8  sub9 sub10 sub11 sub12  sub13
0  231  626  XXXXXXXX  XXXXXXX   https://play.google.com/store/apps    https://XXXXX     0.39  active  None     0     0     1  XXXXX
1  216  626  XXXXXXXX   682666  https://play.google.com/store/appsg  https://XXXXXXX  1.96000  active  None     0     0     1  XXXXX