I need the delivery times for a Fiverr service, but I can only get the delivery time of the first (Basic) package. How can I get the delivery times of the second and third packages? Is there any way to do this without using Selenium?
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.fiverr.com/volkeins/provide-10x-dofollow-backlinks-from-amazon-da96-permanent")
# BEAUTIFULSOUP
soup = BeautifulSoup(response.text, 'lxml')
print(soup.find_all("b", class_ = "delivery"))
CodePudding user response:
The data on this URL is dynamic: it is generated by JavaScript, and BeautifulSoup cannot render JavaScript. You therefore need an automation tool such as Selenium together with BeautifulSoup. Try running the following code.
import time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
url = "https://www.fiverr.com/volkeins/provide-10x-dofollow-backlinks-from-amazon-da96-permanent"
# Selenium 4 expects the driver path wrapped in a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
driver.get(url)
time.sleep(10)  # crude fixed wait for the JavaScript-rendered content to load
soup = BeautifulSoup(driver.page_source, 'lxml')
driver.quit()  # close the browser and end the WebDriver session
print(soup.find("b", class_="delivery").text)
Output:
7 Days Delivery
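Only one value is printed because find() returns just the first match. A minimal sketch of collecting every match instead, assuming the rendered page contains one b.delivery element per package (this is not confirmed for the Fiverr page). The sample HTML, its delivery values, and the helper name extract_delivery_times are made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source after the page has rendered.
SAMPLE_HTML = """
<div class="package"><b class="delivery">7 Days Delivery</b></div>
<div class="package"><b class="delivery">10 Days Delivery</b></div>
<div class="package"><b class="delivery">14 Days Delivery</b></div>
"""

def extract_delivery_times(html):
    """Return the text of every <b class="delivery"> element in html."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all("b", class_="delivery")]

print(extract_delivery_times(SAMPLE_HTML))
```

With a real browser session you would pass driver.page_source instead of SAMPLE_HTML. If the other packages only appear after clicking a tab, they would not be in the page source until that interaction happens.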
CodePudding user response:
Using Selenium to print the text 7 Days Delivery, you can use either of the following locator strategies:
Using CSS_SELECTOR and get_attribute("innerHTML"):
driver.get('https://www.fiverr.com/volkeins/provide-10x-dofollow-backlinks-from-amazon-da96-permanent')
print(driver.find_element(By.CSS_SELECTOR, "b.delivery").get_attribute("innerHTML"))
Using XPATH and the text attribute:
driver.get('https://www.fiverr.com/volkeins/provide-10x-dofollow-backlinks-from-amazon-da96-permanent')
print(driver.find_element(By.XPATH, "//b[@class='delivery']").text)
To extract the text 7 Days Delivery, ideally you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR and get_attribute("innerHTML"):
driver.get('https://www.fiverr.com/volkeins/provide-10x-dofollow-backlinks-from-amazon-da96-permanent')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "b.delivery"))).get_attribute("innerHTML"))
Using XPATH and the text attribute:
driver.get('https://www.fiverr.com/volkeins/provide-10x-dofollow-backlinks-from-amazon-da96-permanent')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//b[@class='delivery']"))).text)
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console Output:
7 Days Delivery
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Links to useful documentation:
- get_attribute() method: Gets the given attribute or property of the element.
- text attribute: Returns the text of the element.
- Difference between text and innerHTML using Selenium
CodePudding user response:
With requests.get('https://...').text, you receive only the HTML content of the page as sent by the server. The problem is that many modern websites use client-side rendering to build up the page content, so JavaScript has to run to render the page the way your web browser does. You can use Selenium to achieve this.
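One way to tell whether a page needs a browser is to check whether the target element is already present in the raw HTML. A minimal sketch of that check, using two made-up HTML strings in place of real responses; the helper name is_in_static_html is hypothetical:

```python
from bs4 import BeautifulSoup

def is_in_static_html(html, css_selector):
    """Return True if css_selector matches something in the unrendered HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select_one(css_selector) is not None

# Made-up examples: the first page ships the element in its HTML,
# the second would only create it later via JavaScript.
server_rendered = '<div><b class="delivery">7 Days Delivery</b></div>'
client_rendered = '<div id="root"></div><script>/* builds the page later */</script>'

print(is_in_static_html(server_rendered, "b.delivery"))  # True
print(is_in_static_html(client_rendered, "b.delivery"))  # False
```

If the check returns False for the text returned by requests, the element is most likely inserted by JavaScript and a browser-driven tool such as Selenium is needed.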