How to extract the price of the book from website using Selenium Python

Time:04-07

I am trying to get the price of this book from the link below (I am using Google Colab for the project):

link of the webpage

This is the code I have made:

import sys
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
from selenium import webdriver
from selenium.webdriver.common.by import By
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
wd = webdriver.Chrome('chromedriver',options=options)
wd.get("https://www.amazon.fr/dp/000101742X")
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

product_title = wd.find_element(By.CLASS_NAME, 'a-size-extra-large')

print(product_title.text)

product_image_url = wd.find_element(By.ID, 'imgBlkFront')

print(product_image_url.get_attribute('src'))

product_price = wd.find_element(By.CLASS_NAME, 'a-size-base a-color-price a-color-price')

print(product_price.text)

When I run the code, it gives me this error:

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".a-size-base a-color-price a-color-price"}
  (Session info: headless chrome=99.0.4844.84)
Stacktrace:
#0 0x561c75bf5b63 <unknown>
#1 0x561c758ebc93 <unknown>
#2 0x561c75921ba0 <unknown>
#3 0x561c75921dc1 <unknown>
#4 0x561c75956267 <unknown>
#5 0x561c7593f33d <unknown>
#6 0x561c75953fac <unknown>
#7 0x561c7593f683 <unknown>
#8 0x561c75915c7c <unknown>
#9 0x561c75917145 <unknown>
#10 0x561c75c19fe0 <unknown>
#11 0x561c75c2b17f <unknown>
#12 0x561c75c2af19 <unknown>
#13 0x561c75c2b6e2 <unknown>
#14 0x561c75c643cb <unknown>
#15 0x561c75c2b941 <unknown>
#16 0x561c75c0ed13 <unknown>
#17 0x561c75c35098 <unknown>
#18 0x561c75c3522a <unknown>
#19 0x561c75c4e711 <unknown>
#20 0x7f37b1a316db <unknown>

I searched for a solution and also tried some other methods:

wd.implicitly_wait(20) 
product_price = wd.find_element(By.CLASS_NAME, 'a-size-base a-color-price a-color-price')

print(product_price.text)
###########################################################
try:
    product_price = WebDriverWait(wd, 20).until(
        EC.presence_of_element_located((By.CLASS_NAME, 'a-size-base a-color-price a-color-price'))
    ) # wd.find_element(By.CLASS_NAME, 'a-size-base a-color-price a-color-price')
    print(product_price.text)

finally:
    wd.quit()
#################################################

button = wd.find_element(By.CLASS_NAME, 'a-button a-button-selected a-spacing-mini a-button-toggle format')
button.click()
product_price = wd.find_element(By.CLASS_NAME, 'a-size-base a-color-price a-color-price')

print(product_price.text)

All of these methods gave me errors. Can someone help? Thanks.

CodePudding user response:

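To print the price text, i.e. 23,72, you need to induce WebDriverWait for visibility_of_element_located(). Note that By.CLASS_NAME accepts only a single class name: as the error message shows, Selenium turned 'a-size-base a-color-price a-color-price' into the CSS selector ".a-size-base a-color-price a-color-price", which looks for nested <a-color-price> tags rather than one element carrying all of those classes. You can use the following locator strategy instead: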

  • Using XPATH and text attribute:

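    # `driver` is the WebDriver instance (named `wd` in the question)
    driver.get("https://www.amazon.fr/dp/000101742X")
    # Click the "accept" input first (the consent prompt), once it is clickable
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@name='accept']"))).click()
    # Wait for the first span after the 'Partition' label to be visible, then print the 4th space-separated token of its text
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[text()='Partition']//following::span[1]"))).text.split(" ")[3])
    driver.quit()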
    
  • Console Output:

    23,72
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
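
If you would rather locate the element by its classes than by XPath, the compound class string from the question has to become a CSS selector with no spaces between the classes. Below is a minimal sketch of that conversion in plain Python; the helper name compound_class_to_css is hypothetical (it is not part of Selenium), and whether these classes still identify the price element on the live page is an assumption that is not verified here.

```python
def compound_class_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into a CSS selector
    that matches one element carrying every listed class.

    Hypothetical helper for illustration; not a Selenium API.
    """
    # dict.fromkeys drops duplicate class names while keeping their order
    classes = dict.fromkeys(class_attr.split())
    if not classes:
        raise ValueError("expected at least one class name")
    # ".a.b" means "an element with class a AND class b";
    # ".a b" (with a space) would mean "a <b> tag inside an element with class a"
    return "".join(f".{name}" for name in classes)


selector = compound_class_to_css("a-size-base a-color-price a-color-price")
print(selector)  # .a-size-base.a-color-price
```

The resulting string could then be passed as (By.CSS_SELECTOR, selector) to EC.visibility_of_element_located inside a WebDriverWait, in the same way the XPath locator is used above.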
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
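
Once the element text has been retrieved, picking the price out by position with split(" ")[3] breaks as soon as the surrounding wording changes. A slightly more tolerant option is to search the text for a number written with a decimal comma. The sketch below is only an illustration: extract_price is a hypothetical helper, and the sample string is made up, since the exact text of the element on the page is not shown in this thread.

```python
import re
from decimal import Decimal

# A run of digits, a decimal comma, then exactly two digits, e.g. "23,72"
PRICE_PATTERN = re.compile(r"\d+,\d{2}")


def extract_price(text: str) -> Decimal:
    """Return the first comma-decimal price found in text as a Decimal.

    Hypothetical helper for illustration; thousands separators are not handled.
    """
    match = PRICE_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    # Decimal expects a dot as the decimal separator
    return Decimal(match.group(0).replace(",", "."))


# Made-up sample text; the real element text may differ
print(extract_price("à partir de 23,72 €"))  # 23.72
```

Using Decimal rather than float keeps the value exact, which matters if the prices are later summed or compared.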
