HttpClient 4.3.5 returns 400 when scraping a page

Time:01-27

The version of my scraper written in C sends the following request headers. The fetch works and the server returns 200:
GET / HTTP/1.1
Host: www.moa.gov.cn
Referer: http://www.moa.gov.cn
Accept-Encoding: gzip, deflate, sdch, br
Cache-Control: no-cache, max-age=0
Pragma: no-cache
Connection: close
Upgrade: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36


The version written in Java sends the following request headers. The fetch fails and the server returns 400:
GET http://www.moa.gov.cn HTTP/1.1
Referer: http://www.moa.gov.cn
Accept-Encoding: gzip, deflate, sdch, br
Cache-Control: no-cache, max-age=0
Pragma: no-cache
Connection: close
Upgrade: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36


Java code:
request = new HttpGet(url);
request.setHeader(HttpHeaders.HOST, "www.moa.gov.cn");
request.setHeader(HttpHeaders.REFERER, "http://www.moa.gov.cn");
request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip, deflate, sdch, br");
request.setHeader(HttpHeaders.CACHE_CONTROL, "no-cache, max-age=0");
request.setHeader(HttpHeaders.PRAGMA, "no-cache");
request.setHeader(HttpHeaders.CONNECTION, "close");
request.setHeader(HttpHeaders.UPGRADE, "1");
request.setHeader(HttpHeaders.USER_AGENT, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36");
response = httpClient.execute(request);
Any help from the experts here would be greatly appreciated.

CodePudding user response:

Bumping this in the hope that an expert sees it.

CodePudding user response:

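Remove this line: request.setHeader(HttpHeaders.HOST, "www.moa.gov.cn");. HttpClient derives the Host header from the request URL on its own, and when the request goes through an HTTP proxy, as in most proxy setups, the request line carries the complete URL.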
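To make the difference between the two captured requests easier to see, here is a minimal, library-free sketch of how an HTTP/1.1 request head can be assembled from a URL. It is not HttpClient's internal code; the class RequestHeadSketch and its methods requestLine, hostHeader and requestHead are hypothetical names made up for this illustration. It only shows the general idea that the Host value can be derived from the URL, and that the request line is "GET / HTTP/1.1" for a direct connection but carries the complete URL when sent to an HTTP proxy.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class, not part of Apache HttpClient.
final class RequestHeadSketch {

    // Request line: path only for a direct connection, complete URL via a proxy.
    static String requestLine(URI uri, boolean viaProxy) {
        if (viaProxy) {
            return "GET " + uri + " HTTP/1.1";
        }
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // an empty path is sent as "/"
        }
        if (uri.getRawQuery() != null) {
            path += "?" + uri.getRawQuery();
        }
        return "GET " + path + " HTTP/1.1";
    }

    // Host header value derived from the URL; the port is kept only if explicit.
    static String hostHeader(URI uri) {
        return uri.getPort() == -1 ? uri.getHost() : uri.getHost() + ":" + uri.getPort();
    }

    // Full request head: request line, Host, then the caller's extra headers.
    static String requestHead(URI uri, boolean viaProxy, Map<String, String> extraHeaders) {
        StringBuilder sb = new StringBuilder();
        sb.append(requestLine(uri, viaProxy)).append("\r\n");
        sb.append("Host: ").append(hostHeader(uri)).append("\r\n");
        for (Map.Entry<String, String> e : extraHeaders.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append("\r\n");
        }
        return sb.append("\r\n").toString();
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://www.moa.gov.cn");
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Referer", "http://www.moa.gov.cn");
        headers.put("Connection", "close");
        System.out.print(requestHead(uri, false, headers)); // direct connection
        System.out.print(requestHead(uri, true, headers));  // through an HTTP proxy
    }
}
```

In both forms the Host line comes from the URL alone, which is why the request code does not need to set it by hand.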