scrapy.Request() not working with response.url

Time:07-01

I am building a spider to crawl the different tabs on a page.

In some cases I need to extract a URL to go to the next page:

url = i.css('a').attrib['href'] 
yield response.follow(url=url, callback=self.parse_menu)

And in other cases I don't need to go to a different page, but I still want to move on to the next step in the pipeline (parse_menu), so I do something like this:

yield response.follow(url=response.url, callback=self.parse_menu)

The first scenario works well, but in the second scenario parse_menu never gets called.

I think there is something I am missing about how requests and callbacks work.

Thanks in advance!

CodePudding user response:

Try:

url = i.css('a::attr(href)').get()

CodePudding user response:

I am not sure I understand you correctly, but I think you are sending the same request twice, so Scrapy's duplicate filter drops the second one and the callback never runs. You need to set dont_filter to True:

yield response.follow(url=response.url, callback=self.parse_menu, dont_filter=True)
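To see why the second request disappears: Scrapy reduces each request to a fingerprint and silently drops any request whose fingerprint it has already seen, unless dont_filter=True. Below is a minimal plain-Python sketch of that behavior (the function names and the fingerprint scheme are simplified stand-ins, not Scrapy's actual implementation):

```python
import hashlib

# Fingerprints of requests already scheduled, like Scrapy's dupefilter keeps.
seen = set()

def fingerprint(method: str, url: str) -> str:
    # Simplified stand-in for Scrapy's request fingerprinting.
    return hashlib.sha1(f"{method} {url}".encode()).hexdigest()

def should_schedule(method: str, url: str, dont_filter: bool = False) -> bool:
    fp = fingerprint(method, url)
    if dont_filter:
        return True          # bypass the duplicate filter entirely
    if fp in seen:
        return False         # duplicate -> dropped, callback never fires
    seen.add(fp)
    return True

# First request for the page is scheduled normally.
print(should_schedule("GET", "https://example.com/menu"))
# Re-requesting response.url has the same fingerprint, so it is filtered out...
print(should_schedule("GET", "https://example.com/menu"))
# ...unless dont_filter=True, which is why the fix above works.
print(should_schedule("GET", "https://example.com/menu", dont_filter=True))
```

This is also why the first scenario in the question works: each extracted href produces a new, unseen URL, so those requests pass the filter without any extra arguments.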