Hi everyone. I am trying to convert a nested JSON into a pandas DataFrame. This is what the JSON looks like:
{0: {'Geographical information': 'Sweden',
'Geography': 'mentioned',
'Time': 'not_relevant',
'annotation': 'quantitative_precise',
'claim': ' “"',
'label': 'Mostly true',
'text': 'Swedish.'},
1: {'Geographical information': 'Italy',
'Geography': 'mentioned',
'Time': 'unclear',
'annotation': 'quantitative_precise',
'claim': '',
'label': 'Mostly false',
'text': "."},
2: {'Geography': 'not_relevant',
'Time': 'unclear',
'annotation': 'quantitative_vague',
'claim': ' "”',
'label': 'False',
'text': '.'},
3: {'Geographical information': 'France',
'Geography': 'mentioned',
'Time': 'not_relevant',
'annotation': 'qualitative',
'claim': ' ',
'label': 'Mostly false',
'text': '.'},
Ideally, the resulting DataFrame should have the inner dictionary keys (e.g., "Geographical information") as columns and the outer keys (0, 1, 2, etc.) as rows. I am using the pd.json_normalize() function. However, it treats the outer keys as columns rather than rows (I believe because they are all different integers).
CodePudding user response:
Try:
data = {0: {'Geographical information': 'Sweden',
'Geography': 'mentioned',
'Time': 'not_relevant',
'annotation': 'quantitative_precise',
'claim': ' “"',
'label': 'Mostly true',
'text': 'Swedish.'},
1: {'Geographical information': 'Italy',
'Geography': 'mentioned',
'Time': 'unclear',
'annotation': 'quantitative_precise',
'claim': '',
'label': 'Mostly false',
'text': "."},
2: {'Geography': 'not_relevant',
'Time': 'unclear',
'annotation': 'quantitative_vague',
'claim': ' "”',
'label': 'False',
'text': '.'},
3: {'Geographical information': 'France',
'Geography': 'mentioned',
'Time': 'not_relevant',
'annotation': 'qualitative',
'claim': ' ',
'label': 'Mostly false',
'text': '.'}}
import pandas as pd

df = pd.DataFrame.from_dict(data, orient='index')
print(df)
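With orient='index', the outer keys become the row index and the inner dictionary keys become the columns. See the official pandas documentation for DataFrame.from_dict: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.from_dict.html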
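For a quick check of how this behaves, here is a minimal sketch on a shortened, made-up version of the dictionary (the two-row `data` below is an assumption for illustration, not the full data from the question). It also shows transposing a regular DataFrame as an alternative that should give the same layout:

```python
import pandas as pd

# Shortened, hypothetical stand-in for the dictionary in the question.
# Row 1 deliberately lacks the 'Geographical information' key.
data = {
    0: {'Geographical information': 'Sweden',
        'Geography': 'mentioned',
        'label': 'Mostly true'},
    1: {'Geography': 'not_relevant',
        'label': 'False'},
}

# Outer keys become the row index, inner keys become the columns.
df = pd.DataFrame.from_dict(data, orient='index')

# Alternative: build with outer keys as columns, then transpose.
df_alt = pd.DataFrame(data).T

print(df)
print(df_alt)
```

Rows that lack a key (like row 1 here, or entry 2 in the question, which has no 'Geographical information') get NaN in that column.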