Scraping dynamic webpage using Python

Time:03-01

I am trying to scrape the following dynamically generated page: https://www.governmentjobs.com/careers/capecoral?page=1. I've tried requests, Scrapy, and scrapy-splash, but I only get the initial page source, without any job listings.

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.governmentjobs.com/careers/capecoral?page=1")
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(r.content, "html.parser")
n_jobs = soup.select("#number-found-items")[0].text.strip()
print(n_jobs)

It always returns 0 jobs found.

CodePudding user response:

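You are trying to scrape a website that builds its content with JavaScript. requests only downloads the initial HTML and never runs the page's scripts, so the job list is missing from what you receive. For this you need Selenium: it drives a real browser, so you can read the page contents after the data has been rendered.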

CodePudding user response:

Because the page content is generated dynamically, you can use Selenium together with bs4 to get the desired data. Here is an example that you can run as-is.

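import time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

url = "https://www.governmentjobs.com/careers/capecoral?page=1"

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
driver.get(url)
time.sleep(10)  # crude fixed wait for the JavaScript-rendered listings

# Parse the rendered DOM, not the initial server response
soup = BeautifulSoup(driver.page_source, 'lxml')
driver.quit()  # close the browser once the HTML has been captured

for title in soup.select('.list-item h3 > a'):
    print(title.text)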

Output:

Assistant City Attorney / City Attorney's Office
Business Applications Analyst II / Information Technology Services #6425
Contract Athletic Official / Athletics / Parks & Recreation #6237
Contract Background Investigation Specialist / Investigations / Police Dept.  #6514
Contract Beverage Cart/Waiter/Waitress / Parks and Recreation / Coral Oaks #6479
Contract Counselor / Youth Center / Parks & Recreation #6317
Contract Counselor/Instructor / Parks & Recreation / Special Populations #6339
Contract Custodial Worker / Lake Kennedy / Parks & Recreation #6525
Contract Custodial Worker /Parks & Recreation / Yacht Club #6312
Contract Golf Course Outside Operations / Parks & Recreation / Coral Oaks  #6535
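
Fixed time.sleep delays either waste time or fail on a slow connection. An explicit wait is one possible alternative. The sketch below assumes that the #number-found-items element from the question holds a label such as "12 jobs found" once the listings have loaded. That label format, the helper names parse_job_count and fetch_job_count, and the 20-second timeout are illustrative assumptions and have not been verified against the site.

```python
import re


def parse_job_count(label: str) -> int:
    """Return the first integer in a label such as '12 jobs found' (0 if none)."""
    match = re.search(r"\d+", label.replace(",", ""))
    return int(match.group()) if match else 0


def fetch_job_count(url: str, timeout: float = 20.0) -> int:
    """Open the page in Chrome and wait until the counter shows a nonzero value."""
    # Imported here so parse_job_count stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # Selenium 4.6+ can locate a matching driver itself
    try:
        driver.get(url)
        counter = (By.CSS_SELECTOR, "#number-found-items")  # selector from the question
        # Poll until the page's script has replaced the initial "0" in the counter.
        WebDriverWait(driver, timeout).until(
            lambda d: parse_job_count(d.find_element(*counter).text) > 0
        )
        return parse_job_count(driver.find_element(*counter).text)
    finally:
        driver.quit()  # always close the browser, even if the wait times out


if __name__ == "__main__":
    print(fetch_job_count("https://www.governmentjobs.com/careers/capecoral?page=1"))
```

If the page genuinely lists no openings, the wait raises Selenium's TimeoutException after the timeout, so a caller may want to catch that case and treat it as zero results.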
    
     