In the Scrapy framework, I first used scrapy.FormRequest() to send a POST request, but I did not receive any data. The code is as follows:
import random

import scrapy


class FjggfwSpider(scrapy.Spider):
    name = 'FJGGFW'
    allowed_domains = ['fjggfw.gov.cn']
    start_urls = ['https://www.fjggfw.gov.cn/Website/AjaxHandler/BuilderHandler.ashx']

    def start_requests(self):
        url = self.start_urls[0]
        num = random.random()
        # FormRequest sends these fields as a URL-encoded form body
        data_dict = {
            'OPtype': 'GetListNew',
            'pageNo': '1',
            'pageSize': '10',
            'proArea': '1',
            'category': 'GCJS',
            'announcementType': '5',
            'ProType': '1',
            'XMLX': '1',
            'projectName': '',
            'TopTime': '2019-06-09 00:00:00',
            'EndTime': '2019-09-07 23:59:59',
            'RRR': '%s' % num,
        }
        yield scrapy.FormRequest(url, formdata=data_dict, callback=self.parse)

    def parse(self, response):
        print('+++++', response.text)
        print('=====', len(response.text))
The request-header part of settings.py is as follows:
DEFAULT_REQUEST_HEADERS = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36',
    'Cookie': 'a long cookie string',
    'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    'Accept': 'application/json, text/javascript, */*; q=0.01',
}
When I ran it, I got no data at all. I then switched to another approach, sending the POST with scrapy.Request, but I still received no data. The code is as follows:
import json
import random

import scrapy


class FjggfwSpider(scrapy.Spider):
    name = 'FJGGFW'
    allowed_domains = ['fjggfw.gov.cn']
    start_urls = ['https://www.fjggfw.gov.cn/Website/AjaxHandler/BuilderHandler.ashx']

    def start_requests(self):
        url = self.start_urls[0]
        num = random.random()
        # Serialize the parameters to a JSON string and send it as the raw body
        data_str = json.dumps({
            'OPtype': 'GetListNew',
            'pageNo': 1,
            'pageSize': 10,
            'proArea': 1,
            'category': 'GCJS',
            'announcementType': 5,
            'ProType': 1,
            'XMLX': 1,
            'projectName': '',
            'TopTime': '2019-06-09 00:00:00',
            'EndTime': '2019-09-07 23:59:59',
            'RRR': '%s' % num,
        })
        yield scrapy.Request(url, body=data_str, method="POST", callback=self.parse)
This approach did not work either; I still received no data. Puzzled, I created a new test.py file and tried the same POST with the requests library, and that received data normally. The code is as follows:
import random

import requests

url = 'https://www.fjggfw.gov.cn/Website/AjaxHandler/BuilderHandler.ashx'
headers = {
    'Cookie': 'nk5dz _qddac=4-3-1. 1. 9 v9kbn. K09bvah6; ASP.NET_SessionId=kpf3jdrnokalcn1xy4a2kxu1; __root_domain_v=.fjggfw.gov.cn; _qddaz=QD. W0v57s. S4gyx6. K092itpm; Hm_lvt_94bfa5b89a33cebfead2f88d38657023=1567831712156831, 736; Hm_lvt_63d8823bd78e78665043c516ae5b1514=1567831712156831, 737; 4-1.1 nk5dz _qdda=; _qddab=4-9 v9kbn. K09bvah6; Hm_lpvt_63d8823bd78e78665043c516ae5b1514=1567847414; _qddamta_2852155767=4-0; Hm_lpvt_94bfa5b89a33cebfead2f88d38657023=1567848173; 94572 d3487f831178003076a3b5c57890569894bd829132e5d69b3004e04ecef7f03c1fca8a7951959f60d7142c265065e3794afadb5555b6d2d7abd126ac51ade1332fcda11a2a47d9d731df66d243644724a35e4f1fc20efde9cae6c66faa447bb15a10b7a665de098b86d4371d2e44f101d26cf33fa61878080558f4966b5 _qddagsx_02095bad0b='
}
num = random.random()
# Passing a dict as data= makes requests send a URL-encoded form body
data = {
    'OPtype': 'GetListNew',
    'pageNo': 1,
    'pageSize': 10,
    'proArea': 1,
    'category': 'GCJS',
    'announcementType': 5,
    'ProType': 1,
    'XMLX': 1,
    'projectName': '',
    'TopTime': '2019-06-09 00:00:00',
    'EndTime': '2019-09-07 23:59:59',
    'RRR': '%s' % num,
}
response = requests.post(url, data=data, headers=headers)
print(response.content.decode())
I did not expect this version to receive data normally, which puzzles me even more. In the first two approaches, where I send the POST request from Scrapy with scrapy.Request() and scrapy.FormRequest(), where is my code wrong? Why does the request return no data? I have read the source code for a long time without figuring it out, and search engines have not turned up an answer either. I can only ask the experts here: if anyone understands this, please offer advice or comments. Thank you.
CodePudding user response:
Try commenting out this line:
# allowed_domains = ['fjggfw.gov.cn']
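Separately from allowed_domains, one difference between the three attempts may be worth checking: requests.post(url, data=dict) and scrapy.FormRequest(formdata=...) both send a URL-encoded form body, while scrapy.Request(body=json.dumps(...)) sends a JSON string. The sketch below uses only the standard library and a shortened, hypothetical subset of the payload to show the two body formats side by side. Whether this particular server accepts only form-encoded bodies is an assumption, not something confirmed in the thread.

```python
import json
from urllib.parse import parse_qs, urlencode

# Shortened, hypothetical subset of the payload from the question.
payload = {"OPtype": "GetListNew", "pageNo": 1, "pageSize": 10, "category": "GCJS"}

# Form-encoded body, the format produced when a dict is passed as
# requests.post(data=...) or scrapy.FormRequest(formdata=...).
form_body = urlencode(payload)

# JSON body, the format produced by json.dumps(...) in the second attempt.
json_body = json.dumps(payload)

print(form_body)
print(json_body)

# A handler that parses form fields finds them in the first body only.
print(parse_qs(form_body).get("OPtype"))
print(parse_qs(json_body).get("OPtype"))
```

If the endpoint does expect form fields, passing urlencode(payload) as the body of scrapy.Request together with a form-urlencoded Content-Type header should be equivalent to what FormRequest sends.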