Iterate through nested JSON object and get values throughout

Time:10-09

I'm working on an API project in which I'm trying to get all the redirect URLs from an API output such as https://urlscan.io/api/v1/result/39a4fc22-39df-4fd5-ba13-21a91ca9a07d/

Example of where I'm trying to pull the urls from:

"redirectResponse": {
  "url": "https://www.coke.com/"

I currently have the following code:

import requests
import json
import time

#URL to be scanned
url = 'https://www.coke.com'

#URL Scan Headers
headers = {'API-Key':apikey,'Content-Type':'application/json'}
data = {"url":url, "visibility": "public"}
response = requests.post('https://urlscan.io/api/v1/scan/',headers=headers, data=json.dumps(data))

uuid = response.json()['uuid']
responseUrl = response.json()['api']

time.sleep(10)

req = requests.Session()
r = req.get(responseUrl).json()
r.keys()

for value in  r['data']['requests']['redirectResponse']['url']:
    print(f"{value}")

I get the following error: TypeError: list indices must be integers or slices, not str. I'm not sure what the best way is to parse the nested JSON in order to get all the redirect URLs.

Answer:

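r['data']['requests'] is a list, so indexing it with the string 'redirectResponse' raises the TypeError; you have to loop over its items instead. Within each item the redirectResponse sits under the 'request' key, and it isn't always present, so the code has to handle that and keep going. In Python that's usually done with a try/except: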

for obj in r['data']['requests']:
    try:
        redirectResponse = obj['request']['redirectResponse']
    except KeyError:
        continue  # Ignore and skip to next one.
    url = redirectResponse['url']
    print(f'{url=!r}')
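
If you would rather collect the URLs into a list than print them, the same loop can be written with dict.get instead of try/except. This is only a sketch: collect_redirect_urls and the sample dict are made up for illustration, and the sample is a trimmed-down guess at the response shape based on the snippet in the question, not real API output.

```python
def collect_redirect_urls(result):
    """Return the url of every redirectResponse found in the result dict."""
    urls = []
    # .get() with a default avoids a KeyError when a level is missing.
    for entry in result.get("data", {}).get("requests", []):
        redirect = entry.get("request", {}).get("redirectResponse")
        # Skip requests that have no redirectResponse (or one without a url).
        if redirect and "url" in redirect:
            urls.append(redirect["url"])
    return urls


# Hypothetical sample, shaped like the JSON excerpt in the question.
sample = {
    "data": {
        "requests": [
            {"request": {"redirectResponse": {"url": "https://www.coke.com/"}}},
            {"request": {}},   # a request without a redirect
            {"response": {}},  # an entry without a 'request' key at all
        ]
    }
}

redirect_urls = collect_redirect_urls(sample)
print(redirect_urls)
```

With the real response you would call collect_redirect_urls(r) on the parsed JSON from the question's code.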
