Convert nested JSON with integer keys to a pandas DataFrame

Time:05-31

Hi everyone. I am trying to convert a nested JSON object into a pandas DataFrame. This is what the JSON looks like:

{0: {'Geographical information': 'Sweden',
  'Geography': 'mentioned',
  'Time': 'not_relevant',
  'annotation': 'quantitative_precise',
  'claim': ' “"',
  'label': 'Mostly true',
  'text': 'Swedish.'},
 1: {'Geographical information': 'Italy',
  'Geography': 'mentioned',
  'Time': 'unclear',
  'annotation': 'quantitative_precise',
  'claim': '',
  'label': 'Mostly false',
  'text': "."},
 2: {'Geography': 'not_relevant',
  'Time': 'unclear',
  'annotation': 'quantitative_vague',
  'claim': ' "”',
  'label': 'False',
  'text': '.'},
 3: {'Geographical information': 'France',
  'Geography': 'mentioned',
  'Time': 'not_relevant',
  'annotation': 'qualitative',
  'claim': ' ',
  'label': 'Mostly false',
  'text': '.'}}

Ideally, the resulting DataFrame should have the inner dictionary keys (e.g., "Geographical information") as columns and the outer keys (0, 1, 2, etc.) as the row index. I am using pd.json_normalize(), but it turns the outer keys into columns rather than rows (I believe because they are all different integers).
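To make the behaviour described above concrete, here is a minimal sketch using a shortened, made-up two-entry version of the data (the name `sample` and its values are illustrative assumptions, not the original dataset). A single dict passed to pd.json_normalize() appears to be treated as one record, so the outer keys end up as prefixes of the column names:

```python
import pandas as pd

# Shortened, illustrative stand-in for the nested dict above (assumed values).
sample = {
    0: {"Geography": "mentioned", "label": "Mostly true"},
    1: {"Geography": "mentioned", "label": "Mostly false"},
}

# A single dict is flattened into ONE record, so nested keys become
# dotted column names such as "0.Geography" and "1.label".
wide = pd.json_normalize(sample)

print(wide.shape)          # a single row, one column per (outer key, inner key) pair
print(list(wide.columns))
```

This is why the outer keys show up as columns: the function flattens nesting within one record rather than treating each outer key as a separate row.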

CodePudding user response:

Try:

import pandas as pd

data = {0: {'Geographical information': 'Sweden',
  'Geography': 'mentioned',
  'Time': 'not_relevant',
  'annotation': 'quantitative_precise',
  'claim': ' “"',
  'label': 'Mostly true',
  'text': 'Swedish.'},
 1: {'Geographical information': 'Italy',
  'Geography': 'mentioned',
  'Time': 'unclear',
  'annotation': 'quantitative_precise',
  'claim': '',
  'label': 'Mostly false',
  'text': "."},
 2: {'Geography': 'not_relevant',
  'Time': 'unclear',
  'annotation': 'quantitative_vague',
  'claim': ' "”',
  'label': 'False',
  'text': '.'},
 3: {'Geographical information': 'France',
  'Geography': 'mentioned',
  'Time': 'not_relevant',
  'annotation': 'qualitative',
  'claim': ' ',
  'label': 'Mostly false',
  'text': '.'}}
# orient='index' makes the outer keys the row index and the inner keys the columns
df = pd.DataFrame.from_dict(data, orient='index')
print(df)

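With orient='index', the outer keys (0, 1, 2, 3) become the row index and the inner keys become the columns; a key missing from a record (such as 'Geographical information' for key 2) is filled with NaN. See the official pandas documentation for DataFrame.from_dict: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.from_dict.html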
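One caveat if the data is read from an actual JSON file or string: JSON object keys are always strings, so the index would hold '0', '1', ... rather than integers. The sketch below uses a small made-up JSON string (`raw` and its contents are assumptions for illustration) to show one way of converting the index back to integers, and that a key missing from one record comes out as NaN:

```python
import json

import pandas as pd

# Hypothetical JSON text; after json.loads the outer keys are the strings "0" and "1".
raw = '{"0": {"Geography": "mentioned", "label": "Mostly true"}, "1": {"label": "False"}}'
records = json.loads(raw)

# Outer keys become the row index, inner keys become the columns.
df = pd.DataFrame.from_dict(records, orient="index")

# Convert the string index labels back to integers.
df.index = df.index.map(int)

print(df)
```

Record 1 has no "Geography" key here, so that cell is NaN in the resulting frame.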
