How to grab URL in "View Deal" and price for deal from kayak.com using BeautifulSoup

Time:02-01

I have a list of Kayak URLs and I'd like to grab the price and the link in "View Deal" for the "Best" and "Cheapest" HTML cards. These are essentially the first two results, since I've already sorted the results in the URLs.

CodePudding user response:

There are two problems with your XPath locator:

  1. The class name of the a element is not booking-link but booking-link followed by a trailing space, so an exact match on @class fails.
  2. Your locator also matches duplicate, irrelevant (invisible) elements.

The following locator works:
"//div[@class='above-button']//a[contains(@class,'booking-link')]/span[@class='price option-text']"

So, the relevant code line could be:

xp_prices = "//div[@class='above-button']//a[contains(@class,'booking-link')]/span[@class='price option-text']"
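To illustrate the first point, here is a minimal sketch of why an exact class comparison misses the element while a substring test finds it. The markup is a hypothetical, heavily simplified stand-in for the real page, and the standard-library ElementTree module is used only so the example runs without a browser; since ElementTree has no contains() function, a plain Python substring check stands in for contains(@class, 'booking-link').

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real Kayak page is far larger.
# Note the trailing space inside class="booking-link ".
SAMPLE = """
<div class="above-button">
  <a class="booking-link ">
    <span class="price option-text">$807</span>
  </a>
</div>
"""

root = ET.fromstring(SAMPLE)

# Exact attribute comparison: the trailing space makes this find nothing.
exact = root.findall(".//a[@class='booking-link']")

# Substring test, standing in for XPath contains(@class, 'booking-link').
loose = [a for a in root.iter("a") if "booking-link" in a.get("class", "")]

# Collect the price text from the span below each matched link.
prices = [
    span.text
    for a in loose
    for span in a.findall("span[@class='price option-text']")
]

print(len(exact), len(loose), prices)
```

The same reasoning applies to the Selenium locator above: contains(@class, 'booking-link') tolerates the stray whitespace, whereas @class='booking-link' does not.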

CodePudding user response:

To extract the View Deal prices for the Best and Cheapest sections of the page, you need to induce WebDriverWait for visibility_of_element_located(), and you can use the following locator strategies:

  • From the Best section:

    driver.get("https://www.kayak.com/flights/AMS-WMI,nearby/2023-02-15/WMI-SOF,nearby/2023-02-18/SOF-BEG,nearby/2023-02-20/BEG-MIL,nearby/2023-02-23/MIL-AMS,nearby/2023-02-25/?sort=bestflight_a")
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[text()='Best']//following::div[contains(@class, 'bottom-booking')]//a//div[contains(@class, 'price-text')]"))).text)
    
  • Console output:

    $807
    
  • From the Cheapest section:

    driver.get("https://www.kayak.com/flights/AMS-WMI,nearby/2023-02-15/WMI-SOF,nearby/2023-02-18/SOF-BEG,nearby/2023-02-20/BEG-MIL,nearby/2023-02-23/MIL-AMS,nearby/2023-02-25/?sort=bestflight_a")
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[text()='Cheapest']//following::div[contains(@class, 'bottom-booking')]//a//div[contains(@class, 'price-text')]"))).text)
    
  • Console output:

    $410
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
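
Since the two locators differ only in the section label, they could be wrapped in a small helper. The sketch below is one possible arrangement, not a tested solution against the live site: price_xpath and fetch_deal are hypothetical names, and the assumption that the nearest enclosing a element of the price carries the "View Deal" URL in its href follows only from the shape of the XPath above and has not been verified against Kayak's markup.

```python
from typing import Optional, Tuple


def price_xpath(section: str) -> str:
    """Build the price locator for a section label such as 'Best' or 'Cheapest'."""
    return (
        f"//div[text()='{section}']"
        "//following::div[contains(@class, 'bottom-booking')]"
        "//a//div[contains(@class, 'price-text')]"
    )


def fetch_deal(driver, section: str, timeout: int = 20) -> Tuple[str, Optional[str]]:
    """Return (price text, link href) for the given section of the loaded page."""
    # Imported here so price_xpath() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Wait until the price element is visible, as in the snippets above.
    price_el = WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.XPATH, price_xpath(section)))
    )
    # Assumption: the closest <a> ancestor of the price is the "View Deal" link.
    link_el = price_el.find_element(By.XPATH, "./ancestor::a[1]")
    return price_el.text, link_el.get_attribute("href")
```

After driver.get(url), calling fetch_deal(driver, "Best") and fetch_deal(driver, "Cheapest") would then give one price and link pair per card, which can be repeated for each URL in the list.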
    