from bs4 import BeautifulSoup
import requests
link = requests.get("https://www.upwork.com/nx/jobs/search/?q=web scraping&sort=recency")
source = link.content
soup = BeautifulSoup(source, "lxml")
title = soup.find_all("h4", {"class": "my-0 p-sm-right job-tile-title"})
print(title)
I am trying to scrape the job titles, but I get an empty list, even though the same approach works fine on other websites.
Please help.
CodePudding user response:
You get an empty list because this data is loaded by a separate request, so it is not in the HTML that requests downloads. You can see that request if you open the developer tools in your browser and look at the Network tab.
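If the request you find in the Network tab returns JSON, the titles can usually be read from that response directly instead of from the HTML. Below is a minimal sketch of that idea. The payload shape, the keys "results" and "title", and the helper name extract_titles are all assumptions made for illustration; the real structure has to be read from the response shown in the Network tab.

```python
import json

# Hypothetical payload: the real field names must be taken from the
# response you see in the browser's Network tab.
SAMPLE_RESPONSE = """
{"results": [
    {"title": "Webscraper using Python"},
    {"title": "Data collection from internet"}
]}
"""


def extract_titles(raw_json, list_key="results", title_key="title"):
    """Return the title strings found in a JSON search response."""
    payload = json.loads(raw_json)
    # Skip entries that have no title instead of raising KeyError.
    return [job[title_key] for job in payload.get(list_key, []) if title_key in job]


titles = extract_titles(SAMPLE_RESPONSE)
print(titles)
```

In a real script, raw_json would be the text of the response to that separate request. The site may also require headers or cookies that the browser sends along with it.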
CodePudding user response:
The page content is rendered dynamically with JavaScript, and requests with bs4 cannot execute JavaScript. You can get the rendered page by using selenium together with bs4:
import time
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get('https://www.upwork.com/nx/jobs/search/?q=web scraping&sort=recency')
#driver.maximize_window()
time.sleep(5)
soup = BeautifulSoup(driver.page_source, 'lxml')
for title in soup.find_all("h4", {"class": "my-0 p-sm-right job-tile-title"}):
    print(title.get_text())
Output:
Data collection from internet
Compile the contact list | Lead Generation(Contact Names and E-Mail Addresses)
Python project development for real estate management
I need someone to scrape a list of 800 LinkedIn profiles
Webscraper using Python
Software_engineer
Web Interface with integrarted ability of parsing PDF files into database
Web scrapping contact details using python
Shopping mall crawling ... python mysql
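The fixed time.sleep(5) in the code above either wastes time or is too short on a slow connection. One alternative is to poll until the titles actually appear. Selenium ships WebDriverWait (in selenium.webdriver.support.ui) for this. The sketch below shows the same idea with a small hand-written helper, so it runs without a browser. The names wait_until and fake_find_titles are hypothetical, and fake_find_titles only simulates a page that fills in after a few checks.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() until it returns a truthy value or timeout elapses.

    Returns the truthy value; raises TimeoutError if none arrives in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition was not met within the timeout")
        time.sleep(interval)


# Stand-in for a real lookup: returns an empty list on the first two
# calls, then a list of titles, like a page that is still loading.
attempts = {"count": 0}


def fake_find_titles():
    attempts["count"] += 1
    if attempts["count"] >= 3:
        return ["Webscraper using Python"]
    return []


titles = wait_until(fake_find_titles, timeout=5.0, interval=0.01)
print(titles)
```

With the Selenium code above, the condition could be a function that parses driver.page_source with BeautifulSoup and returns the result of find_all, so the loop ends as soon as at least one h4 title is present.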