I am new to coding and using Stack Overflow for the first time, and I am hoping to get some help with this.
I am trying to scrape the total number of jobs shown on this page: https://jobs.bestbuy.com/bby?id=all_jobs&spa=1&s=req_id_num
Here is my code:
import os
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
os.environ['PATH'] = "/Users/monicayadav/PycharmProjects/pythonProject4/selenium/venv/bin"
driver = webdriver.Firefox()
driver.implicitly_wait(30)
driver.get('https://jobs.bestbuy.com/bby?id=all_jobs&spa=1&s=req_id_num')
wait = WebDriverWait(driver, 10)
JobCountBESTBUY = wait.until(ec.presence_of_element_located((By.XPATH, "//p[contains(@class, 'font-wt-500 ng-binding')]"))).text
print(JobCountBESTBUY)
Output I am getting
jobs found
Process finished with exit code 0
I am getting only "jobs found" as the result, but I need the number that appears in front of it (1,925).
CodePudding user response:
Solution 1 - The easier one
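Use time.sleep(seconds) to wait for the page to finish loading the results. The element apparently exists before the count has been filled in, so the presence check alone returns too early. The code would look something like the following. Don't forget to import time.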
import time
# ... Removed code for simplicity ...
driver.get('https://jobs.bestbuy.com/bby?id=all_jobs&spa=1&s=req_id_num')
time.sleep(10)
wait = WebDriverWait(driver, 10)
JobCountBESTBUY = wait.until(ec.presence_of_element_located((By.XPATH, "//p[contains(@class, 'font-wt-500 ng-binding')]"))).text
print(JobCountBESTBUY)
Solution 2 - The faster one
On the other hand, time.sleep always waits for the full duration, even when the text is already available. Another approach is to wait for the text itself, as in the following. The advantage is that the wait ends as soon as a match is found, and the number can be returned directly.
import re
# ... Removed code for simplicity ...
driver.get('https://jobs.bestbuy.com/bby?id=all_jobs&spa=1&s=req_id_num')
WebDriverWait(driver, 10).until(ec.presence_of_element_located((By.XPATH, "//p[contains(@class, 'font-wt-500 ng-binding')]")))
# Matches `1,234`, `1`, `12`, `1,234,567`
r = re.compile(r'^([0-9,]+).*$')
JobCountBESTBUY = WebDriverWait(driver, 10).until(
    # Returns the matched number once the text starts with digits; a falsy result makes the wait retry
    lambda _: (e := driver.find_element(By.XPATH, "//p[contains(@class, 'font-wt-500 ng-binding')]"))
    and (m := r.match(e.text))
    and m.group(1)
)
print(JobCountBESTBUY)
Output
1,988
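If the count is needed as an integer rather than the string "1,988", the same idea can be sketched without Selenium. This is only a minimal illustration: the names parse_job_count and poll_until are hypothetical helpers, not part of Selenium, and the loop only roughly mimics how an explicit wait keeps calling its condition until it returns something usable.

```python
import re
import time

# Same idea as the pattern in Solution 2: leading digits and commas, e.g. "1,988".
COUNT_RE = re.compile(r'^([0-9,]+)')


def parse_job_count(text):
    """Return the leading number in `text` as an int, or None if there is none."""
    m = COUNT_RE.match(text.strip())
    if not m:
        return None
    digits = m.group(1).replace(',', '')
    return int(digits) if digits else None


def poll_until(read_text, timeout=10.0, interval=0.5):
    """Call read_text() repeatedly until its result contains a parsable count.

    `read_text` is any zero-argument callable returning the current text
    (hypothetical stand-in for reading the element's text from the page).
    """
    deadline = time.monotonic() + timeout
    while True:
        count = parse_job_count(read_text())
        if count is not None:
            return count
        if time.monotonic() >= deadline:
            raise TimeoutError('no job count appeared before the timeout')
        time.sleep(interval)
```

With the driver from the code above, this could be called as poll_until(lambda: driver.find_element(By.XPATH, "//p[contains(@class, 'font-wt-500 ng-binding')]").text), assuming the element already exists when polling starts.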