I am trying to read a JSON file where the data are nested at several levels, i.e. top --> inner --> innermost.
I have tried pd.json_normalize, but I don't think it is working. I have attached a screenshot. In it, the topmost level is "WTFY_Combined", and inside it there are three more levels of data. Out of those levels, I need to read "OccuMa", which is highlighted in yellow, and inside "OccuMa" there is another level of data, "OccuCode" and "OccuDesc". I need to read those two into two different dataframes.
I know that one way of doing this is to put those two into two separate JSON files, but in practice I will have this kind of multi-level structure to read.
I am trying the code below:
import pandas as pd
import json as js
with open("filepath", "r") as f:
    data = js.loads(f.read())
df_flat = pd.json_normalize(data, record_path=['OccuCode'])
df_flat2 = pd.json_normalize(data, record_path=['OccuDesc'])
But it is not working; it raises a KeyError, presumably because I am not mapping the data into the dataframe properly.
CodePudding user response:
import pandas as pd
import json as js
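with open("filepath", "r") as f:
    json_data = js.loads(f.read())
# Walk down the nested keys to reach the "OccuMa" level
json_data = json_data['WTYF_Combined']['Extraction']['abc']['OccuMa']
# Build a dataframe and keep only the two fields of interest
df_data = pd.DataFrame(json_data)[['OccuCode','OccuDesc']]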
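
The KeyError in the question most likely comes from record_path=['OccuCode'] being looked up at the top level of the JSON, where no such key exists. As a rough sketch of an alternative, pd.json_normalize can be given the full chain of keys in record_path. The inline data below is hypothetical and only stands in for the file: it assumes "OccuMa" holds a list of records, uses the key spelling from the snippet above, and the values are made up, so adjust the path to match the real file.

```python
import pandas as pd

# Hypothetical data standing in for the file contents; the real nesting
# and values may differ from what the screenshot shows.
data = {
    "WTYF_Combined": {
        "Extraction": {
            "abc": {
                "OccuMa": [
                    {"OccuCode": "A1", "OccuDesc": "Engineer"},
                    {"OccuCode": "B2", "OccuDesc": "Teacher"},
                ]
            }
        }
    }
}

# record_path lists every key from the top level down to the list of records.
occu_path = ["WTYF_Combined", "Extraction", "abc", "OccuMa"]
df_occu = pd.json_normalize(data, record_path=occu_path)

# Split the two fields into separate single-column dataframes.
df_code = df_occu[["OccuCode"]]
df_desc = df_occu[["OccuDesc"]]

print(df_code)
print(df_desc)
```

If "OccuMa" turns out to be a dict of columns rather than a list of records, passing it straight to pd.DataFrame as in the snippet above is probably the simpler route.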