How to click on the Read more link from the first review within tripadvisor using Selenium and Python

I am using SelectorGadget to get the XPath for the "Read more" button in the first review on this website.

This is the xpath it gave:

//*[contains(concat( " ", @class, " " ), concat( " ", "Z", " " ))]

Here is the first part of the code I am using:

import selenium
import csv #This package lets us save data to a csv file
from selenium import webdriver #The Selenium package we'll need
import time #This package lets us pause execution for a bit
from selenium.webdriver.common.by import By

path_to_file = "/Users/user/Desktop/HotelReviews.csv"

pages_to_scrape = 3

url = "https://www.tripadvisor.com/Hotel_Review-g60982-d209422-Reviews-Hilton_Waikiki_Beach-Honolulu_Oahu_Hawaii.html"

# open the file to save the review
csvFile = open(path_to_file, 'a', encoding="utf-8")
csvWriter = csv.writer(csvFile)

for i in range(0, pages_to_scrape):
    
    driver = webdriver.Chrome()
    driver.get("url")
    # give the DOM time to load
    time.sleep(2) 
    driver.find_element_by_xpath("//*[contains(concat( " ", @class, " " ), 
    concat( " ", "Z", " " ))], 'Read more')]").click()

This is the error I get:

File "/var/folders/6c/jpl964752rv_72zjclrp_8ym0000gn/T/ipykernel_24978/2812702568.py", line 8
    driver.find_element_by_xpath("//*[contains(concat( " ", @class, " " ), concat( " ", "Z", " " ))], 'Read more')]").click()
                                                                                         ^
SyntaxError: invalid syntax

It looks like the quotation marks are the issue.

So I followed this advice. I tried making the code a variable, but it spit out the same error. I tried removing the extra quotes, same error. I tried removing the space between the quotes, same error.

I also tried a different XPath, one for the whole review, //*[contains(concat( " ", @class, " " ), concat( " ", "F1", " " ))], and got the same error.

Then I tried adjusting the quotation marks on the first xpath

driver.find_element_by_xpath("//*[contains(concat( " ", @class, " " ), 
    concat( " ", "Z", " " ))]", "Read more")]).click()

This results in the same error.

CodePudding user response：

To click() on the Read more link in the first review on the tripadvisor website, you need to wait until the element is clickable by using WebDriverWait with element_to_be_clickable(). You can use the following locator strategy:

Using XPATH:

driver.get('https://www.tripadvisor.com/Hotel_Review-g60982-d209422-Reviews-Hilton_Waikiki_Beach-Honolulu_Oahu_Hawaii.html')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@placeholder='Search reviews']//following::div[@data-test-target='HR_CC_CARD']//span[text()='Read more']"))).click()

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
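
Putting the pieces together, a minimal sketch might look like the following. The names READ_MORE_XPATH, click_first_read_more and main are made up for illustration and do not come from the thread. The XPath is the one given above and may no longer match the live page. The Selenium imports are placed inside the functions only so that the file can be loaded without Selenium installed; putting them at the top of the file works just as well.

```python
# XPath copied from the answer above; the site's markup may have changed since.
READ_MORE_XPATH = (
    "//input[@placeholder='Search reviews']"
    "//following::div[@data-test-target='HR_CC_CARD']"
    "//span[text()='Read more']"
)


def click_first_read_more(driver, timeout=20):
    """Wait until the first 'Read more' link is clickable, then click it."""
    # Imported here so the locator above can be inspected without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    link = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable((By.XPATH, READ_MORE_XPATH))
    )
    link.click()


def main():
    """Open the hotel page once and expand the first review."""
    from selenium import webdriver

    url = (
        "https://www.tripadvisor.com/Hotel_Review-g60982-d209422-Reviews-"
        "Hilton_Waikiki_Beach-Honolulu_Oahu_Hawaii.html"
    )
    driver = webdriver.Chrome()
    try:
        driver.get(url)  # pass the variable, not the literal string "url"
        click_first_read_more(driver)
    finally:
        driver.quit()
```

Calling main() starts Chrome, so it is left for you to run. Note that the code in the question calls driver.get("url") with the literal string "url" rather than the variable, and creates a new browser on every loop iteration; the sketch opens the browser once.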


CodePudding user response：

The basic problem is that while, for example, a[x="3"] is a valid XPath expression, you can't put it in a Python string literal as "a[x="3"]" without escaping the quotes. I'm not a Python user, but in most languages you would write "a[x=\"3\"]". Alternatively, single and double quotes are interchangeable in XPath, so you could write "a[x='3']".
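
For what it's worth, the quoting options can be checked in plain Python without Selenium. The sketch below writes the XPath from the question three ways; the variable names are only for illustration. Backslash escaping does work in Python, and so does switching the outer quotes to single quotes.

```python
# The XPath from the question, written three ways as a Python string literal.

# 1. Backslash-escape the inner double quotes.
escaped = "//*[contains(concat( \" \", @class, \" \" ), concat( \" \", \"Z\", \" \" ))]"

# 2. Delimit the Python literal with single quotes, so the inner double quotes are plain characters.
single_outside = '//*[contains(concat( " ", @class, " " ), concat( " ", "Z", " " ))]'

# 3. Use single quotes inside the XPath, since XPath accepts either kind.
single_inside = "//*[contains(concat( ' ', @class, ' ' ), concat( ' ', 'Z', ' ' ))]"

# Forms 1 and 2 are the same string; form 3 differs only in the quote character.
assert escaped == single_outside
assert single_inside == single_outside.replace('"', "'")

# The unescaped form is rejected by the Python parser before Selenium ever runs.
broken_source = 'xpath = "a[x="3"]"'
try:
    compile(broken_source, "<example>", "exec")
    raised = False
except SyntaxError:
    raised = True
assert raised
```

Separately from the quoting, the trailing , 'Read more')] that the question appends to the XPath leaves the brackets unbalanced, so that expression would not be valid XPath even once the Python string itself parses.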