I am trying to scrape this website. Running scrapy shell from my CLI, I get a response for every XPath up to //table[@class='table my-table'],
but anything beyond that returns an empty list [].
I don't think the content is generated by JavaScript. Have I missed a technique, or is my approach with Scrapy wrong?
Here is my code for reference:
import scrapy
from scrapy import Request


class MarketDataSpider(scrapy.Spider):
    name = "nepse_floorsheet"

    def start_requests(self):
        url = 'http://www.nepalstock.com/main/floorsheet/index/0/'
        yield Request(url, callback=self.parse)

    def parse(self, response):
        for tr in response.xpath("//table[@class='table my-table']"):
            print(tr.xpath("//tbody//tr[position()>2and position()<23]"))
CodePudding user response:
To search within an element you have already selected, start the XPath expression with a dot. Without it, a leading // searches from the document root again and ignores the element you called .xpath() on:
tr.xpath(".//tbody//tr[position()>2 and position()<23]")
I did not test it, but this is the correct way to write a relative XPath. See the Scrapy documentation: https://docs.scrapy.org/en/latest/
CodePudding user response:
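Just remove tbody from the XPath and it will return a non-empty result. A likely reason is that tbody is added by the browser when it builds the DOM and is not present in the raw HTML that Scrapy downloads. Note that the output keeps changing, because the site updates the table data dynamically.
Example: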
import scrapy


class MarketDataSpider(scrapy.Spider):
    name = "nepse_floorsheet"

    def start_requests(self):
        url = 'http://www.nepalstock.com/main/floorsheet/index/1/'
        yield scrapy.Request(url, callback=self.parse)

    def parse(self, response):
        # Skip the header rows and take the data rows of the table.
        for tr in response.xpath("//table[@class='table my-table']//tr[position()>2 and position()<23]"):
            yield {
                'Quantity': tr.xpath('.//td[6]/text()').get(),
                'Rate': tr.xpath('.//td[7]/text()').get(),
            }