python pandas json_normalize - How to flatten nested dict - key error in record path-CodePudding

I am working with a .json file from a public API (https://agsi.gie.eu/, for the sake of completeness) but having troubles flattening it using pandas.

The .json I am working with looks like this (excerpt):

json = {'Austria': [{'image': 'iVBORw0KxuOmB00wGjmw4Y3XTA=',
   'short_name': 'astora',
   'name': 'astora GmbH',
   'url': 'http://www.astora.de',
   'eic': '21X000000001160J',
   'facilities': [{'eic': '21W000000000078N',
     'name': 'UGS Haidach (astora)',
     'country': {'code': 'AT', 'name': 'Austria'},
     'type': 'Storage Facility'}],
   'data': {'type': 'SSO',
    'country': {'code': 'AT', 'name': 'Austria'},
    'code': 'EU',
    'name': 'Europe'}},
  {'image': 'iVBORw0KGgoAAAANSUhEUgAAAIwA',
   'short_name': 'GSA',
   'name': 'GSA LLC',
   'url': 'http://www.gsa-services.ru/',
   'eic': '25X-GSALLC-----E',
   'facilities': [{'eic': '25W-SPHAID-GAZ-M',
     'name': 'UGS Haidach (GSA)',
     'country': {'code': 'AT', 'name': 'Austria'},
     'type': 'Storage Facility'}],
   'data': {'type': 'SSO',
    'country': {'code': 'AT', 'name': 'Austria'},
    'code': 'EU',
    'name': 'Europe'}},
  {'image': 'iVBO7OBOjcRx8LfUS fgE/eevQiV /D85wMd0sggG8GgAAAABJRU5ErkJggg==',
   'short_name': 'OMV Gas Storage',
   'name': 'OMV Gas Storage GmbH',
   'url': 'http://www.omv-gas-storage.com',
   'eic': '25X-OMVGASSTORA5',
   'facilities': [{'eic': '21W000000000081Y',
     'name': 'VGS OMV Gas Storage Pool (Tallesbrunn & Schönkirchen / Reyersdorf)',
     'country': {'code': 'AT', 'name': 'Austria'},
     'type': 'Storage Group'}],
   'data': {'type': 'SSO',
    'country': {'code': 'AT', 'name': 'Austria'},
    'code': 'EU',
    'name': 'Europe'}},
  {'image': 'iVBORw0WjUMEoVDAKFYxCoYJRqGAU/y/ EWAAX72IOPqJ0L0AAAAASUVORK5CYII=',
   'short_name': 'RAG Energy Storage',
   'name': 'RAG Energy Storage',
   'url': 'http://www.rag-energy-storage.at/',
   'eic': '23X----100225-1C',
   'facilities': [{'eic': '21W000000000079L',
     'name': 'RAG Storage Pool (Puchkirchen / Haag, Aigelsbrunn, Haidach 5, 7Fields-RAG)',
     'country': {'code': 'AT', 'name': 'Austria'},
     'type': 'Storage Group'}],
   'data': {'type': 'SSO',
    'country': {'code': 'AT', 'name': 'Austria'},
    'code': 'EU',
    'name': 'Europe'}},
  {'image': 'iVBORw0KGgoAAAANXqi7W2jLByR/ToOH22',
   'short_name': 'Uniper Energy Storage (AT)',
   'name': 'Uniper Energy Storage GmbH',
   'url': 'https://www.uniper-energy-storage.com',
   'eic': '21X000000001127H',
   'facilities': [{'eic': '21W000000000057V',
     'name': 'UGS 7 Fields (Uniper Energy Storage) - AT',
     'country': {'code': 'AT', 'name': 'Austria'},
     'type': 'Storage Facility'}],
   'data': {'type': 'SSO',
    'country': {'code': 'AT', 'name': 'Austria'},
    'code': 'EU',
    'name': 'Europe'}}]}

and I am trying to flatten it, such that it looks like this: This is my desired outcome, flattened such that each facility is a row (other country data may have multiple facilities per company name), with the company data added as meta data.

I achieved this up to now by utilizing:

Austria = pd.json_normalize(json, record_path="facilities", meta=["name", "type", "eic"], meta_prefix='company_').drop(["url"], axis=1)

Since some days, this line returns me a KeyError: 'facilities'

KeyError: "Key 'facilities' not found. If specifying a record_path, all elements of data should have the path."

Although the key 'facilities' is in all elements of the data specified. What am I missing?

Any help and hints highly appreciated!

CodePudding user response：

The code below should work for you. Notice the commented changes.

Austria = pd.json_normalize(
    json['Austria'],              # <- Changed here from just `json`
    record_path="facilities",
    meta=["name", "type", "eic"],
    meta_prefix='company_',
    errors='ignore',              # <- Add this to ignore errors
).drop(
    ["url"],
    axis=1,
    errors='ignore',              # <- Add this to ignore errors
)

And this is the result:

	eic	name	type	country.code	country.name	company_name	company_type	company_eic
0	21W000000000078N	UGS Haidach (astora)	Storage Facility	AT	Austria	astora GmbH	nan	21X000000001160J
1	25W-SPHAID-GAZ-M	UGS Haidach (GSA)	Storage Facility	AT	Austria	GSA LLC	nan	25X-GSALLC-----E
2	21W000000000081Y	VGS OMV Gas Storage Pool (Tallesbrunn & Schönkirchen / Reyersdorf)	Storage Group	AT	Austria	OMV Gas Storage GmbH	nan	25X-OMVGASSTORA5
3	21W000000000079L	RAG Storage Pool (Puchkirchen / Haag, Aigelsbrunn, Haidach 5, 7Fields-RAG)	Storage Group	AT	Austria	RAG Energy Storage	nan	23X----100225-1C
4	21W000000000057V	UGS 7 Fields (Uniper Energy Storage) - AT	Storage Facility	AT	Austria	Uniper Energy Storage GmbH	nan	21X000000001127H