How do I fire multiple JavaScript events on a webpage using Python?

Time:01-10

I am webscraping Glassdoor.com for company reviews using Python.

Currently, I am using Beautiful Soup and grequests. This works fine for all the fields I need, except for the "Advice to Management" section, which only loads once the Continue Reading button is pressed. See the example below from this page of reviews:

[Images: Continue Reading button; expanded review]

There are no changes to the URL as far as I can tell, but there is a JS click-event being fired in the console: Event: EiReviews: Click [continueReading-71858088]

I found an online tutorial for Selenium WebDriver, such as this one, and wrote this code:

from selenium import webdriver
from selenium.webdriver.common.by import By

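driver = webdriver.Chrome(executable_path="C:\\chromedriver.exe")
driver.get("https://www.glassdoor.com/Reviews/Alteryx-Reviews-E351220.htm")

# Keep the element itself: chaining .click() here would assign None to btn
btn = driver.find_element(By.CLASS_NAME, "v2__EIReviewDetailsV2__continueReading")
# Click through JavaScript, which also works if the button is covered by an overlay
driver.execute_script("arguments[0].click();", btn)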

I need something that scales better, as this takes ~20 seconds to open Chrome and click a single button. I need to be able to click every "Continue Reading" button on the page, as my end goal is to scrape every review for ~1,000 companies.

CodePudding user response:

If you look at the HTML of the page, you will notice that right before the <div id="Container"> element there is a script element starting with window.appCache={.... It contains the complete reviews in a dictionary format, including the text that appears when you click Continue Reading. For example:

"summary":"Great place to work, been here 4  years",
"summaryOriginal":null,"advice":"Don't rush too finish a project"