Get all loaded website resources with Selenium

Time:10-19

My goal is to get all external resources loaded by a website, like the ones listed in Chrome DevTools > Network.

Is there an easy way to achieve this with Selenium?

CodePudding user response:

Found a solution: I used Selenium Wire, which gives access to the requests the browser makes, to collect the resources.

Here is my solution implemented in Python:

from urllib.parse import urlparse
from seleniumwire import webdriver
from selenium.webdriver.chrome.options import Options

def get_resources(url: str) -> set:
    chrome_options = Options()
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')

    driver = webdriver.Chrome(options=chrome_options)
    site_host = urlparse(url).netloc
    resources = set()

    try:
        driver.get(url)

        # Selenium Wire exposes captured requests via the `requests` attribute
        for request in driver.requests:
            # Keep only answered requests whose host does not contain the page's host
            if request.response and site_host not in urlparse(request.url).netloc:
                resources.add(request.url)
    finally:
        # Always close the browser, even if loading the page fails
        driver.quit()

    return resources
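
One caveat with the `not in` test above: it is a substring match on the host, so for a page on example.com a request to notexample.com would be counted as internal. If that matters, a stricter comparison could look like the sketch below. The helper name `is_external` is hypothetical (it is not part of Selenium or Selenium Wire), and it assumes that subdomains of the page's host should still count as internal.

```python
from urllib.parse import urlparse


def is_external(page_url: str, request_url: str) -> bool:
    """Return True if request_url's host is neither the page's host nor one of its subdomains.

    Hypothetical helper, shown only to illustrate a stricter host comparison.
    """
    # `hostname` drops any port and is None for URLs without a host (e.g. data: URLs)
    page_host = (urlparse(page_url).hostname or "").lower()
    request_host = (urlparse(request_url).hostname or "").lower()

    if not request_host:
        # No host to compare (data:, blob:, ...), so do not report it as external
        return False

    is_same_host = request_host == page_host
    is_subdomain = request_host.endswith("." + page_host)
    return not (is_same_host or is_subdomain)
```

Inside the loop, the condition would then read `if request.response and is_external(url, request.url):`.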