My goal is to get all external resources loaded by a website, like the list you see in Chrome DevTools > Network.
Is there an easy way to achieve this with Selenium?
CodePudding user response:
I found a solution: Selenium Wire records the requests the browser makes, so the loaded resources can be read from the driver.
Here is my solution, implemented in Python:
from urllib.parse import urlparse

from seleniumwire import webdriver
from selenium.webdriver.chrome.options import Options


def get_resources(url: str) -> set:
    chrome_options = Options()
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')
    driver = webdriver.Chrome(options=chrome_options)
    resources = set()
    try:
        driver.get(url)
        # Selenium Wire exposes the captured traffic via the `requests` attribute
        for request in driver.requests:
            # Keep requests that received a response and whose host does not
            # contain the page's host (so subdomains count as internal)
            if request.response and urlparse(url).netloc not in urlparse(request.url).netloc:
                resources.add(request.url)
    finally:
        # Always close the browser, even if loading the page fails
        driver.quit()
    return resources
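
One caveat with the filter above: the `not in` test is a substring check, so a page on `example.com` would also treat `notexample.com` as internal. A possible stricter variant is sketched below. The helper names `is_external` and `external_resources` are my own and not part of Selenium Wire, and the rule "same host or a subdomain of it is internal" is an assumption that may not fit every site (for example, a page on `www.example.com` would see `example.com` as external).

```python
from urllib.parse import urlparse


def is_external(page_url: str, request_url: str) -> bool:
    """Return True when request_url is served from a host outside the page's host."""
    page_host = (urlparse(page_url).hostname or "").lower()
    request_host = (urlparse(request_url).hostname or "").lower()
    if not page_host or not request_host:
        # URLs without a host (e.g. data: URLs) are not counted as external
        return False
    # The same host, or any subdomain of it, counts as internal
    is_internal = request_host == page_host or request_host.endswith("." + page_host)
    return not is_internal


def external_resources(page_url: str, request_urls) -> set:
    """Filter an iterable of URLs down to the external ones."""
    return {u for u in request_urls if is_external(page_url, u)}
```

Inside `get_resources` this could be used as `external_resources(url, (r.url for r in driver.requests if r.response))`, replacing the loop.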