I could not extract the first column, "Name", from the table on this website. Can anyone help? The website address is: https://www.dianashippinginc.com/the-fleet/
'''
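from selenium import webdriver
from selenium.webdriver.common.by import By

# Selenium 4.6+ locates a matching chromedriver automatically
driver = webdriver.Chrome()
driver.get('https://www.dianashippinginc.com/fleet-employment-table/')
cookie_address = '//*[@id="ccc-notify-accept"]/span'
name_address = '/html/body/div[3]/div/div/div[2]/table/tbody/tr[3]/td[2]/span'
# Dismiss the cookie banner, then read the text of the first "Name" cell
driver.find_element(By.XPATH, cookie_address).click()
name = driver.find_element(By.XPATH, name_address).text
print(name)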
'''
CodePudding user response:
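import scrapy


class MySpider(scrapy.Spider):
    name = 'myspider'
    start_urls = [r'https://www.dianashippinginc.com/the-fleet/']

    def parse(self, response):
        # NOTE: the attribute test inside [@] is missing here; supply the
        # attribute that identifies the name cells before running.
        names = response.xpath('//div[@]/text()').getall()
        # Process the names list as needed (remove tab characters, ranking numbers, etc.)
        # Scrapy expects a dict or Item from parse(), not a bare list.
        yield {'names': names}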
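
The clean-up step mentioned in the comment could look roughly like the sketch below. It assumes the raw strings are cell texts padded with tabs or newlines and sometimes prefixed with a ranking number such as `1` or `2.`; the helper name `clean_names` and the sample strings are made up for illustration and do not come from the site.

```python
import re

# Hypothetical pattern: a leading ranking number such as "1 " or "2. ".
_RANK_PREFIX = re.compile(r'^\d+[.)]?\s+')


def clean_names(raw_names):
    """Return trimmed, non-empty names with any leading ranking number removed."""
    cleaned = []
    for raw in raw_names:
        # Collapse tabs, newlines, and repeated spaces into single spaces.
        text = ' '.join(raw.split())
        # Drop a leading ranking number, if one is present.
        text = _RANK_PREFIX.sub('', text)
        if text:
            cleaned.append(text)
    return cleaned


if __name__ == '__main__':
    # Invented sample input, not real data from the website.
    sample = ['\t\t1 Alpha\n', '   ', '\n2. Beta\t', 'Gamma  Delta']
    print(clean_names(sample))
```

Inside `parse`, the result could then be yielded as `{'names': clean_names(names)}`. If the real cells are formatted differently, the regular expression would need adjusting.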