How to scrape all the prices using Selenium and Python

Time:01-24

I'm trying to get ticket prices from Viagogo with no luck. The script seems quite simple and works with other websites, but not with Viagogo. I have no trouble getting the title text from the page.

It always returns an empty result (i.e. []). Can anyone help?

Code trials:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import pandas as pd

s = Service("[]/Downloads/chromedriver/chromedriver.exe")
driver = webdriver.Chrome(service=s)
driver.get('https://www.viagogo.com/Concert-Tickets/Country-and-Folk/Taylor-Swift-Tickets/E-151214704')
price = driver.find_elements(by=By.XPATH, value('//div[@id="clientgridtable"]/div[2]/div/div/div[3]/div/div[1]/span[@]'))
print(price)
[]

I'm expecting the 7 prices shown on the right-hand side of the page, just above "per ticket".

CodePudding user response:

There is an error in the definition of price: it should be value='...' instead of value('...'). Moreover, you should define it with an explicit wait so that the driver waits for the prices to be visible on the page before returning them.

price = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "...")))

Note that this command needs the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
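
To illustrate why waiting helps, here is a simplified, pure-Python sketch of the polling idea behind an explicit wait. It is not Selenium's implementation: wait_until, WaitTimeout and FakePage are hypothetical names made up for this example, and FakePage merely imitates a page whose prices only appear after a few lookups.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy within the timeout."""


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose prices render after a delay."""

    def __init__(self, ready_after_calls):
        self.calls = 0
        self.ready_after_calls = ready_after_calls

    def find_prices(self):
        self.calls += 1
        if self.calls < self.ready_after_calls:
            return []  # what an immediate lookup sees before rendering
        return ["Rs.25,509", "Rs.25,873"]


page = FakePage(ready_after_calls=3)
immediate = page.find_prices()  # single lookup, no waiting
waited = wait_until(page.find_prices, timeout=5, poll_interval=0.01)
print(immediate)
print(waited)
```

The single lookup comes back empty, like the [] in the question, while the polled lookup keeps retrying until the elements exist or the timeout expires.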

CodePudding user response:

To extract all 7 prices shown on the right-hand side of the page, you have to use WebDriverWait with visibility_of_all_elements_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and text attribute:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.f-list__cell-pricing-ticketstyle > div.w100 > span")))])
    
  • Using XPATH and text attribute:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='f-list__cell-pricing-ticketstyle']/div[@class='w100 ']/span")))])
    
  • Console Output:

    ['Rs.25,509', 'Rs.25,873', 'Rs.27,403', 'Rs.28,788', 'Rs.72,809', 'Rs.65,593', 'Rs.29,153', 'Rs.29,153']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
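
If the scraped strings are needed as numbers afterwards, they can be parsed with the standard library. This is only a sketch under stated assumptions: parse_price and PRICE_PATTERN are hypothetical names, and the pattern assumes the format seen in the console output above, that is a non-numeric currency prefix followed by digits with ',' as the thousands separator.

```python
import re
from decimal import Decimal

# Assumption: prices look like 'Rs.25,509', i.e. a non-numeric currency
# prefix followed by digits that use ',' as the thousands separator.
PRICE_PATTERN = re.compile(
    r"^\s*(?P<currency>\D*?)\s*(?P<amount>\d[\d,]*(?:\.\d+)?)\s*$"
)


def parse_price(text):
    """Split a price string into (currency prefix, Decimal amount)."""
    match = PRICE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised price format: {text!r}")
    # Drop the thousands separators before converting to a number.
    amount = Decimal(match.group("amount").replace(",", ""))
    return match.group("currency"), amount


# The strings from the console output above.
scraped = ['Rs.25,509', 'Rs.25,873', 'Rs.27,403', 'Rs.28,788',
           'Rs.72,809', 'Rs.65,593', 'Rs.29,153', 'Rs.29,153']
parsed = [parse_price(text) for text in scraped]
amounts = [amount for _, amount in parsed]
print(min(amounts), max(amounts))
```

Sites that show prices in other locales may use ',' as the decimal separator, in which case the pattern and the replace() call would need adjusting.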
    