Python selenium headless mode missing elements

Time:11-19

I am using Selenium to scrape the Amazon search results page. As I was wrapping up, I switched the scraper to headless mode for efficiency. However, in headless mode certain page elements, such as the sponsored brand section, are not available. Everything works fine in non-headless mode, but it fails in headless mode even after setting the following options:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
#options.headless = True
# Window and proxy settings
options.add_argument("--window-size=1920,1080")
options.add_argument("--disable-extensions")
options.add_argument("--proxy-server='direct://'")
options.add_argument("--proxy-bypass-list=*")
options.add_argument("--start-maximized")
# Headless mode and environment flags
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--no-sandbox')
options.add_argument('--ignore-certificate-errors')
options.add_argument('--allow-running-insecure-content')
driver = webdriver.Chrome(options=options)

PS: I tried with and without the commented section as well as with just the commented section.

For clarification, I took a screenshot of each case: this is what it looks like when I run it in headless mode, and this is what it normally looks like (without headless mode, as well as in normal user browsing). I am wondering what else needs to be added for the sponsored brand information to show up when I run it in headless mode. I am thinking it may be a problem with JavaScript not communicating properly with the browser?

As always, thank you in advance!!

CodePudding user response:

Using the latest Google Chrome v95.0

  • When you use the normal headed browser, the following user agent is in use:

    Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36
    
  • Whereas when you use the headless browser, the following user agent is in use:

    Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/95.0.4638.69 Safari/537.36
    

The presence of the additional Headless token in the user agent string is detected as a bot by the website. Hence you see the difference.
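
A common workaround, which the answer above does not state and which is offered here only as a sketch, is to override the user agent in headless mode so that it no longer contains the Headless token. The helper names headed_user_agent and build_driver are hypothetical, and the version number is simply the one quoted above. Sites may rely on other signals as well, so this may not be enough on its own to make the sponsored brand section appear.

```python
# User agent reported by headless Chrome, as quoted in the answer above.
HEADLESS_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/95.0.4638.69 Safari/537.36"
)


def headed_user_agent(ua: str) -> str:
    """Return the user agent with the 'HeadlessChrome' token replaced by 'Chrome'."""
    return ua.replace("HeadlessChrome/", "Chrome/")


def build_driver():
    """Create a headless Chrome driver that sends a headed-looking user agent.

    Hypothetical helper: assumes Selenium and a matching chromedriver are installed.
    """
    # Imported here so headed_user_agent() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    options.add_argument("--window-size=1920,1080")
    # Chrome's --user-agent switch overrides the default headless user agent.
    options.add_argument(f"--user-agent={headed_user_agent(HEADLESS_UA)}")
    return webdriver.Chrome(options=options)
```

After calling build_driver(), running driver.execute_script("return navigator.userAgent") is one way to check which user agent the page actually sees.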
