Using xpath gives empty output

Time:01-24

I want to get the address, but my spider returns an empty value. What am I doing wrong in the XPath?

Code trials:

import scrapy
from scrapy import Selector
from scrapy_selenium import SeleniumRequest
from scrapy.http import Request


class TestSpider(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        # Render the search results page with Selenium before parsing it.
        yield SeleniumRequest(
            url="https://www.findtruckservice.com/search/?city=Florida, CO&mainCat=1&subCat=Truck Repair&lat=37.0731&lon=-106.247&cat_field=Mobile Repair - Truck Repair",
            wait_time=3,
            screenshot=True,
            callback=self.parse,
            dont_filter=True,
        )

    def parse(self, response):
        # Follow each listing link found in the search results.
        books = response.xpath("//h3//a//@href").extract()
        for book in books:
            url = response.urljoin(book)
            yield Request(url, callback=self.parse_book)

    def parse_book(self, response):
        # .get() returns only the first matching text node.
        address = response.xpath("//div[1][@class='threecol align_left card']//div//text()").get()
        yield {
            'address': address
        }

CodePudding user response:

To print the desired text from the website, wait for the element to become visible by using WebDriverWait with visibility_of_element_located(). Either of the following locator strategies works:

  • Using XPATH and text attribute:

    driver.get("https://www.findtruckservice.com/page/cummins-sales-and-service-farmington-nm-430653")
    print(WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.XPATH, "//h4[@class='sec-title' and text()='CONTACT']//following::div[@class='container']"))).text)
    
  • Using XPATH and get_attribute("textContent"):

    driver.get("https://www.findtruckservice.com/page/cummins-sales-and-service-farmington-nm-430653")
    print(WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.XPATH, "//h4[@class='sec-title' and text()='CONTACT']//following::div[@class='container']"))).get_attribute("textContent"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console Output:

    Cummins Sales and Service
    1101 N Troy King Rd
    Farmington, NM
    505-327-7331 (primary)
    505-326-2948 (fax)
    

CodePudding user response:

Try the following:

[...]

address = ' '.join([x.strip() for x in response.xpath("//div[@class='threecol align_left card'][1]/div[@class='container']/text()").extract()])
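One possible reason the original XPath looks empty is that .get() returns only the first matching text node, and that node may be nothing but the whitespace between tags. The sketch below illustrates the strip-and-join idea on a plain Python list. The sample_nodes layout is a hypothetical stand-in for what .extract() might return (the address lines are taken from the console output in the other answer), and join_text_nodes is a helper name made up for this illustration. Unlike the one-liner above, it also drops the blank nodes so the result has no doubled spaces.

```python
def join_text_nodes(nodes):
    """Strip each text node, drop the blank ones, and join the rest with spaces."""
    stripped = (node.strip() for node in nodes)
    return ' '.join(part for part in stripped if part)


# Hypothetical result of .extract() on the address container. The
# whitespace-only entries stand in for indentation between child elements;
# the real page may be laid out differently.
sample_nodes = [
    "\n        ",
    "Cummins Sales and Service",
    "\n        1101 N Troy King Rd\n        ",
    "Farmington, NM",
    "\n    ",
]

# .get() would hand back only this first node, which is blank once stripped.
first_node = sample_nodes[0]
address = join_text_nodes(sample_nodes)

print(repr(first_node.strip()))
print(address)
```

In the spider, the same helper could be applied to the list returned by .extract() in place of the single value returned by .get().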