Parsing nested JSON variable with Python3

Time:06-05

I'm looking for the most convenient way to parse the 'value' values out of this JSON output from an API call. I used json.dumps to get this output:

{
  "listEvents": {
    "operationResult": "SUCCESS",
    "responseDateTime": "2022-06-04T00:40:10.244-05:00",
    "page": {
      "currentPage": 1,
      "pageSize": 10,
      "totalPage": 1,
      "totalResults": 5
    },
    "searchResults": {
      "nameValuePairs": [
        {
          "nameValuePair": [
            {
              "name": "Event Name",
              "value": "Basic Editing: Final Cut Pro X"
            },
            {
              "name": "Event Start Time",
              "value": "18:30:00"
            },
            {
              "name": "Event End Time",
              "value": "21:30:00"
            },
            {
              "name": "Event ID",
              "value": "1900"
            },
            {
              "name": "Event Start Date",
              "value": "2022-06-13"
            },
            {
              "name": "Event End Date",
              "value": "2022-06-14"
            }
          ]
        },
        {
          "nameValuePair": [
            {
              "name": "Event Name",
              "value": "Basic Studio: Camera"
            },
            {
              "name": "Event Start Time",
              "value": "18:30:01"
            },
            {
              "name": "Event End Time",
              "value": "20:30:01"
            },
            {
              "name": "Event ID",
              "value": "1855"
            },
            {
              "name": "Event Start Date",
              "value": "2022-06-07"
            },
            {
              "name": "Event End Date",
              "value": "2022-06-07"
            }
          ]
        },
        {
          "nameValuePair": [
            {
              "name": "Event Name",
              "value": "Field Camera: HC-X1"
            },
            {
              "name": "Event Start Time",
              "value": "18:30:01"
            },
            {
              "name": "Event End Time",
              "value": "21:30:01"
            },
            {
              "name": "Event ID",
              "value": "1885"
            },
            {
              "name": "Event Start Date",
              "value": "2022-06-22"
            },
            {
              "name": "Event End Date",
              "value": "2022-06-23"
            }
          ]
        },
        {
          "nameValuePair": [
            {
              "name": "Event Name",
              "value": "Final Cut Pro X: Advanced Editing"
            },
            {
              "name": "Event Start Time",
              "value": "18:30:00"
            },
            {
              "name": "Event End Time",
              "value": "21:30:00"
            },
            {
              "name": "Event ID",
              "value": "1915"
            },
            {
              "name": "Event Start Date",
              "value": "2022-06-15"
            },
            {
              "name": "Event End Date",
              "value": "2022-06-15"
            }
          ]
        },
        {
          "nameValuePair": [
            {
              "name": "Event Name",
              "value": "Orientation"
            },
            {
              "name": "Event Start Time",
              "value": "10:00:01"
            },
            {
              "name": "Event End Time",
              "value": "12:00:01"
            },
            {
              "name": "Event ID",
              "value": "1840"
            },
            {
              "name": "Event Start Date",
              "value": "2022-06-18"
            },
            {
              "name": "Event End Date",
              "value": "2022-06-18"
            }
          ]
        }
      ]
    }
  }
}

I get this error when I try to loop through it:

for item in events['listEvents']['searchResults']['nameValuePairs']['nameValuePair']:
    print(item['value'])

 TypeError: string indices must be integers

I understand that the error occurred because item is a string value, but I'm not sure how to approach this. If I skip json.dumps and hard-code the following indices, I can read a single value, but I don't know how to do this for all of them:

item = events['listEvents']['searchResults']['nameValuePairs'][0]['nameValuePair'][0]['value']
print(item)

Basic Editing: Final Cut Pro X

What do I need to do here?

CodePudding user response:

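The object at nameValuePairs is a list of dicts, so you can't index it with the key 'nameValuePair' directly; you have to iterate over the list first. Note also that "string indices must be integers" means events was still a string at that point: json.dumps returns a str, so index the parsed dict (for example, the result of json.loads) instead. Look carefully at the structure.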

Here's how you can access each individual dict:

In [3]: searchResults = events["listEvents"]["searchResults"]

In [4]: for nameValuePairDictList in searchResults["nameValuePairs"]:
   ...:     for nameValuePair in nameValuePairDictList["nameValuePair"]:
   ...:         print(nameValuePair)
   ...:
   ...:
{'name': 'Event Name', 'value': 'Basic Editing: Final Cut Pro X'}
{'name': 'Event Start Time', 'value': '18:30:00'}
{'name': 'Event End Time', 'value': '21:30:00'}
{'name': 'Event ID', 'value': '1900'}
{'name': 'Event Start Date', 'value': '2022-06-13'}
{'name': 'Event End Date', 'value': '2022-06-14'}
{'name': 'Event Name', 'value': 'Basic Studio: Camera'}
{'name': 'Event Start Time', 'value': '18:30:01'}
{'name': 'Event End Time', 'value': '20:30:01'}
{'name': 'Event ID', 'value': '1855'}
{'name': 'Event Start Date', 'value': '2022-06-07'}
{'name': 'Event End Date', 'value': '2022-06-07'}
{'name': 'Event Name', 'value': 'Field Camera: HC-X1'}
{'name': 'Event Start Time', 'value': '18:30:01'}
{'name': 'Event End Time', 'value': '21:30:01'}
{'name': 'Event ID', 'value': '1885'}
{'name': 'Event Start Date', 'value': '2022-06-22'}
{'name': 'Event End Date', 'value': '2022-06-23'}
{'name': 'Event Name', 'value': 'Final Cut Pro X: Advanced Editing'}
{'name': 'Event Start Time', 'value': '18:30:00'}
{'name': 'Event End Time', 'value': '21:30:00'}
{'name': 'Event ID', 'value': '1915'}
{'name': 'Event Start Date', 'value': '2022-06-15'}
{'name': 'Event End Date', 'value': '2022-06-15'}
{'name': 'Event Name', 'value': 'Orientation'}
{'name': 'Event Start Time', 'value': '10:00:01'}
{'name': 'Event End Time', 'value': '12:00:01'}
{'name': 'Event ID', 'value': '1840'}
{'name': 'Event Start Date', 'value': '2022-06-18'}
{'name': 'Event End Date', 'value': '2022-06-18'}
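
To tie this back to the original error, here is a minimal sketch, assuming a trimmed-down sample with the same shape as the response in the question (the sample data and the names as_text, parsed, and values are only for illustration). It shows that indexing the json.dumps string raises the TypeError, and that the parsed structure can be looped over to collect only the 'value' entries:

```python
import json

# Trimmed sample with the same nesting as the API response in the question.
events = {
    "listEvents": {
        "searchResults": {
            "nameValuePairs": [
                {"nameValuePair": [
                    {"name": "Event Name", "value": "Orientation"},
                    {"name": "Event ID", "value": "1840"},
                ]},
                {"nameValuePair": [
                    {"name": "Event Name", "value": "Basic Studio: Camera"},
                    {"name": "Event ID", "value": "1855"},
                ]},
            ]
        }
    }
}

# json.dumps returns a str, so indexing it with a key raises TypeError.
as_text = json.dumps(events)
try:
    as_text["listEvents"]
except TypeError as exc:
    error_message = str(exc)

# json.loads turns the text back into nested dicts and lists.
parsed = json.loads(as_text)

# Collect only the "value" strings, one inner list per event.
values = [
    [pair["value"] for pair in event["nameValuePair"]]
    for event in parsed["listEvents"]["searchResults"]["nameValuePairs"]
]
print(values)
```

If the API client already hands you a dict, the json.dumps and json.loads round trip is unnecessary; json.dumps is only useful for pretty-printing.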

CodePudding user response:

I'd recommend creating a list of dictionaries from the search results, using each name as the key and the corresponding value as the value.

If dct is your data from the question, then:

out = []
for pair in dct["listEvents"]["searchResults"]["nameValuePairs"]:
    tmp = {}
    for p in pair["nameValuePair"]:
        tmp[p["name"]] = p["value"]
    out.append(tmp)



for event in out:
    print(
        "{:<40} {:<10} {:<10}".format(
            event["Event Name"],
            event["Event Start Time"],
            event["Event End Time"],
        )
    )

Prints:

Basic Editing: Final Cut Pro X           18:30:00   21:30:00  
Basic Studio: Camera                     18:30:01   20:30:01  
Field Camera: HC-X1                      18:30:01   21:30:01  
Final Cut Pro X: Advanced Editing        18:30:00   21:30:00  
Orientation                              10:00:01   12:00:01  
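
The same list can be built more compactly with comprehensions. This is only a sketch, assuming a shortened sample in the same shape as dct from the question; the by_id lookup table is a hypothetical addition that is handy when you need to find one event by its 'Event ID':

```python
# Shortened sample in the same shape as the data from the question.
dct = {
    "listEvents": {
        "searchResults": {
            "nameValuePairs": [
                {"nameValuePair": [
                    {"name": "Event Name", "value": "Basic Studio: Camera"},
                    {"name": "Event ID", "value": "1855"},
                ]},
                {"nameValuePair": [
                    {"name": "Event Name", "value": "Orientation"},
                    {"name": "Event ID", "value": "1840"},
                ]},
            ]
        }
    }
}

# One flat dict per event: each "name" becomes a key, each "value" its value.
out = [
    {p["name"]: p["value"] for p in event["nameValuePair"]}
    for event in dct["listEvents"]["searchResults"]["nameValuePairs"]
]

# Hypothetical helper structure: look events up by their ID string.
by_id = {event["Event ID"]: event for event in out}

print(by_id["1855"]["Event Name"])
```

This assumes every event carries an 'Event ID' pair and that the IDs are unique; otherwise later events would overwrite earlier ones in by_id.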

You can then easily create a pandas DataFrame from the data:

import pandas as pd

df = pd.DataFrame(out)
print(df)

Prints:

                          Event Name Event Start Time Event End Time Event ID Event Start Date Event End Date
0     Basic Editing: Final Cut Pro X         18:30:00       21:30:00     1900       2022-06-13     2022-06-14
1               Basic Studio: Camera         18:30:01       20:30:01     1855       2022-06-07     2022-06-07
2                Field Camera: HC-X1         18:30:01       21:30:01     1885       2022-06-22     2022-06-23
3  Final Cut Pro X: Advanced Editing         18:30:00       21:30:00     1915       2022-06-15     2022-06-15
4                        Orientation         10:00:01       12:00:01     1840       2022-06-18     2022-06-18
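
Whichever container you use, every value is still a string. If you need real types, for example to sort events chronologically, a sketch along these lines may help. It assumes the date and time formats shown in the question (YYYY-MM-DD and HH:MM:SS) and uses a two-event sample; start_of is a hypothetical helper name:

```python
from datetime import datetime

# Two events in the flattened form produced by the loop above.
out = [
    {
        "Event Name": "Basic Editing: Final Cut Pro X",
        "Event Start Date": "2022-06-13",
        "Event Start Time": "18:30:00",
        "Event ID": "1900",
    },
    {
        "Event Name": "Basic Studio: Camera",
        "Event Start Date": "2022-06-07",
        "Event Start Time": "18:30:01",
        "Event ID": "1855",
    },
]


def start_of(event):
    """Combine the start date and start time strings into one datetime."""
    return datetime.strptime(
        event["Event Start Date"] + " " + event["Event Start Time"],
        "%Y-%m-%d %H:%M:%S",
    )


# Sort by actual start time and convert the ID strings to integers.
ordered = sorted(out, key=start_of)
ids = [int(event["Event ID"]) for event in ordered]

print([event["Event Name"] for event in ordered], ids)
```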