I am trying to simulate an AJAX request with Scrapy's FormRequest to get the next page on this website: https://www.the-academy.nl/trainingen. My headers look like this:
headers = {
    'path': 'https://www.the-academy.nl/Page?$$ajaxid=view:_id1:_id2:_id3:_id4:_id5:6:_id114:_id116:tblView',
    'authority': 'www.the-academy.nl',
    'accept-encoding': 'gzip, deflate, br',
    'content-length': '1225',
    'content-type': 'multipart/form-data'
}
and my formdata looks like this:
formdata = {
    '$$viewid': '!1rjej6ewgse3x0h6r86gfzlst!',
    '$$xspsubmitid': 'view:_id1:_id2:_id3:_id4:_id5:6:_id114:_id116:viewPager__Group__lnk__1',
    '$$xspexecid': 'view:_id1:_id2:_id3:_id4:_id5:6:_id114:_id116:viewPager',
    '$$xspsubmitvalue': '',
    '$$xspsubmitscroll': '0|1272',
}
I am getting a response, but it's the 404 page. Thank you in advance.
Answer:
I used "java" as the search term. I selected only the form data entries that have key-value pairs.
Don't inject the 'content-length' header.
Add method='POST'.
Call FormRequest.from_response.
Below is an example that returns a 200 response status.
Script:
import scrapy
from scrapy.crawler import CrawlerProcess


class AspSpider(scrapy.Spider):
    name = 'asp'

    def start_requests(self):
        # POST the pager's "Next" form fields to the search results page.
        yield scrapy.FormRequest(
            url='https://www.the-academy.nl/zoekresultatenpagina?text=java',
            formdata={
                'view:_id1:_id2:_id3:_id4:_id5:2:_id86:_id88:query': "",
                'view:_id1:_id2:_id3:_id4:_id5:3:_id94:_id96:query': "",
                '$$viewid': '!eaie1cfxpuckx0dbjrxsxrw60!',
                '$$xspsubmitid': 'view:_id1:_id2:_id3:_id199:_id200:0:_id201:_id203:viewPager__Next',
                '$$xspexecid': 'view:_id1:_id2:_id3:_id199:_id200:0:_id201:_id203:viewPager',
                '$$xspsubmitscroll': '0|1500',
                'view:_id1': 'view:_id1',
                '$$xspsubmitvalue': ""
            },
            callback=self.parse_item,
            # No 'content-length' here. Note that FormRequest URL-encodes
            # formdata; the content-type below is as copied from the browser.
            headers={
                'accept': '*/*',
                'accept-encoding': 'gzip, deflate, br',
                'accept-language': 'en-US,en;q=0.9',
                'content-type': 'multipart/form-data; boundary=----WebKitFormBoundary2aCYMIdAcbwx4FjO',
                'referer': 'https://www.the-academy.nl/zoekresultatenpagina?text=java'
            },
            method='POST'
        )

    def parse_item(self, response):
        pass


if __name__ == "__main__":
    # Pass the spider class to crawl(), not to the CrawlerProcess constructor.
    process = CrawlerProcess()
    process.crawl(AspSpider)
    process.start()
Output:
DEBUG: Crawled (200) <POST https://www.the-academy.nl/zoekresultatenpagina?text=java> (referer: https://www.the-academy.nl/zoekresultatenpagina?text=java)