Crawling Taobao comments with Python

Time:09-30


For a project I need to crawl more than 50,000 comments on a single product, but the site only lets you browse up to page 99, which is fewer than 2,000 comments in total. How can I get at the rest of the 50,000+ comments?

CodePudding user response:

Keep increasing the page number past 99 in the URL and see whether the later pages still return data. If they don't, the platform probably just doesn't expose the rest of the comments!

CodePudding user response:

They aren't exposed. You can only click through to page 99 by hand, so there really is no way.

CodePudding user response:

Quoting the 1st-floor reply from "program monkey":
Keep increasing the page number past 99 in the URL and see whether the later pages still return data. If they don't, the platform probably just doesn't expose the rest of the comments!

They aren't exposed. You can only click through to page 99 by hand, so there really is no way.

CodePudding user response:

Quoting the 3rd-floor reply from "do you want to sweet":
Quote: 1st-floor reply from "program monkey":
Keep increasing the page number past 99 in the URL and see whether the later pages still return data. If they don't, the platform probably just doesn't expose the rest of the comments!

They aren't exposed. You can only click through to page 99 by hand, so there really is no way.


Page through to 99 by hand and look at the URL; it is sure to carry a page parameter. If the value being sent is 99, try sending 100, 101 and so on and see whether any data loads. If nothing comes back however you try, there is no way around it: the backend has probably filtered out the earlier comments.
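
To make that suggestion concrete, here is a minimal sketch that rewrites the page parameter of a URL copied from the address bar. The parameter name `currentPage` and the example URL are assumptions for illustration; use whatever name actually shows up in your own URL. The helper `with_page` is hypothetical and only builds the URLs; it does not send any request.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page(url, page, param="currentPage"):
    """Return `url` with its page parameter set to `page`.

    `param` is an assumed name; check the real URL for the actual one.
    """
    parts = urlsplit(url)
    # Keep every existing query parameter and overwrite only the page number.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query[param] = str(page)
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical URL as it might look after paging to 99 by hand.
    last_visible = ("https://rate.tmall.com/list_detail_rate.htm"
                    "?itemId=596285864342&currentPage=99")
    for page in (100, 101):
        print(with_page(last_visible, page))
```

Request each printed URL and compare the response with page 99. If the later pages come back empty or repeat page 99, the remaining comments are most likely not served at all.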

CodePudding user response:

How did the original poster crawl the review data? Even with a cookie, can't you crawl very often?

CodePudding user response:

Could you share the crawler code first?

CodePudding user response:

Taobao and Tmall use different review interfaces, and the product fields they expect differ too. This is basic code for fetching review data from a Tmall shop:
import json

import requests

url = "https://rate.tmall.com/list_detail_rate.htm"
headers = {
    # Paste the Cookie header from your own logged-in browser session here.
    "Cookie": "<your cookie string>",
    "Referer": "https://detail.tmall.com/item.htm",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 UBrowser/6.2.4098.3 Safari/537.36",
}
params = {  # query parameters sent with the request
    "itemId": "596285864342",  # product ID
    "sellerId": "2616970884",
    "currentPage": "2",  # page number
    "callback": "jsonp2359",
}
# Decode the body and strip the JSONP wrapper (\r\njsonp2359( ... )) so only the JSON is left
req = requests.get(url, params=params, headers=headers).content.decode("utf-8")[12:-1]
result = json.loads(req)
print(result)
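
The code above removes the JSONP wrapper by slicing off a fixed number of characters, which breaks as soon as the callback name changes length. As a hedged alternative, the sketch below strips any `callback( ... )` wrapper with a regular expression. The helper name `parse_jsonp` and the sample payload are made up for illustration and say nothing about the real response fields.

```python
import json
import re

# Leading whitespace, a callback name, then "(" ... ")" and an optional ";".
_JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_jsonp(text):
    """Extract and parse the JSON payload from a JSONP response body."""
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError("response is not in callback(...) form")
    return json.loads(match.group(1))


if __name__ == "__main__":
    # Hypothetical body, shaped like the wrapper described in the code above.
    sample = '\r\njsonp2359({"example": [1, 2, 3]})'
    print(parse_jsonp(sample))
```

With this helper, the last lines of the code above could become `result = parse_jsonp(response.text)`, assuming `response` holds the object returned by `requests.get`.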


Here is a tutorial, with code, on getting Taobao review data: https://blog.csdn.net/u011280778/article/details/104197803