I wrote a program to scrape data from Amazon, but I am getting errors that I do not understand. I use XPath to locate elements by class, and I am trying to extract the book names from an Amazon results page. I search Amazon for the keyword "hacking books"; the search itself succeeds, but the script does not return any results afterwards. I tried the following code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time as t
import pandas as pd

driver = webdriver.Chrome(executable_path='chromedriver.exe')
wait = WebDriverWait(driver, 5)
url = "https://www.amazon.com"
driver.get(url)
keyword = "hacking books"
search_book = driver.find_element(By.ID,'twotabsearchtextbox')
search_book.send_keys(keyword)
search_button = driver.find_element(By.ID,'nav-search-submit-button')
search_button.click()
big_list = []
while True:
    try:
        items = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//a[@class =alink-normal s-underline-text s-underline-link-text s-link-style a-text-normal]')))
        for i in items:
            big_list.append((i.text, i.get_attribute('href')))
        next_page_button = wait.until(EC.element_to_be_clickable((By.XPATH, '//span[@class=s-pagination-strip]//a[contains(text(), "Next")]')))
        next_page_button.location_once_scrolled_into_view
        t.sleep(10)
        next_page_button.click()
        print('clicked, going to next page')
        t.sleep(10)
    except TimeoutException:
        print('all pages done')
        break
df = pd.DataFrame(big_list, columns = ['Book', 'Url'])
print(df)
df.to_csv('hacking_books.csv')
driver.quit()
Can you help me find the bug?
CodePudding user response:
The problem is simple: the class values in your XPath are not wrapped in double quotes. Try
//a[@class="alink-normal s-underline-text s-underline-link-text s-link-style a-text-normal"]
The same applies to the other XPath: put its class value in double quotes as well.
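The quoting rule can be checked without opening a browser. Below is a minimal sketch that uses Python's built-in xml.etree.ElementTree on a tiny stand-in document. The markup and the /book-1 link are made up for illustration and are not Amazon's real HTML. ElementTree supports only a subset of XPath and reports a bad predicate as SyntaxError, whereas Selenium reports its own exception type, so this only illustrates the quoting rule rather than reproducing what the browser does.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in markup, NOT Amazon's real page structure.
SAMPLE = """
<div>
  <a class="alink-normal s-underline-text" href="/book-1">Book 1</a>
</div>
"""

root = ET.fromstring(SAMPLE)

# Quoted attribute value: the predicate parses and matches the link.
quoted_matches = root.findall('.//a[@class="alink-normal s-underline-text"]')

# Unquoted multi-word value: the predicate cannot be parsed at all.
try:
    root.findall('.//a[@class=alink-normal s-underline-text]')
    unquoted_error = None
except SyntaxError as exc:
    unquoted_error = exc

print("quoted matches:", [a.get("href") for a in quoted_matches])
print("unquoted error:", unquoted_error)
```

In the Selenium script the same idea means keeping single quotes around the Python string and double quotes around the class value inside it.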
CodePudding user response:
Your XPath is not a valid expression. A valid XPath expression looks like this:
Relative path: '//tagname[@attribute="value"]'
So you just have to put the attribute value in double quotes.
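If you build several locators of this shape, a small helper keeps the quoting consistent. This is only a sketch: xpath_by_attribute is a hypothetical name, not part of Selenium, and it assumes the value itself contains no double quotes.

```python
def xpath_by_attribute(tag: str, attribute: str, value: str) -> str:
    """Build a relative XPath of the form //tag[@attribute="value"].

    Hypothetical helper for illustration; not a Selenium API.
    """
    if '"' in value:
        # Keep the sketch simple: values with embedded double quotes
        # would need different quoting and are rejected here.
        raise ValueError("value must not contain double quotes")
    return f'//{tag}[@{attribute}="{value}"]'


# The class values below are copied from the question's code.
item_xpath = xpath_by_attribute(
    "a",
    "class",
    "alink-normal s-underline-text s-underline-link-text s-link-style a-text-normal",
)
pagination_xpath = xpath_by_attribute("span", "class", "s-pagination-strip")

print(item_xpath)
print(pagination_xpath)
```

The resulting strings can be passed as the second element of the (By.XPATH, ...) tuple in the script from the question.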