# Taobao product image crawler
import urllib.request
import re
import random

keyname = "dress"
key = urllib.request.quote(keyname)
uapools = [
    "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko",
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3314.0 Safari/537.36 SE 2.X MetaSr 1.0",
    "Mozilla/5.0 (Windows; U; Windows NT 6.1; ) AppleWebKit/534.12 (KHTML, like Gecko) Maxthon/3.0 Safari/534.12",
]

def ua(uapools):
    # Pick a random User-Agent and attach it to a new opener
    thisua = random.choice(uapools)
    print(thisua)
    headers = ("User-Agent", thisua)
    opener = urllib.request.build_opener()
    opener.addheaders = [headers]
    # Install it as the global opener
    urllib.request.install_opener(opener)

for i in range(1, 101):
    # Each result page holds 44 items; s is the item offset
    url = "https://s.taobao.com/search?q=" + key + "&s=" + str((i - 1) * 44)
    ua(uapools)
    data = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    pat = 'pic_url":"//(.*?)"'
    imglist = re.compile(pat).findall(data)
    print(imglist)
    for j in range(0, len(imglist)):
        thisimg = imglist[j]
        thisimgurl = "http://" + thisimg
        localfile = "D:/Python practice/taobao pictures/" + str(i) + str(j) + ".jpg"
        urllib.request.urlretrieve(thisimgurl, filename=localfile)
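The paging arithmetic in the loop can be checked without touching the network. What follows is only a minimal sketch, assuming (as the loop does) that each result page holds 44 items and that `s` is the item offset. `PAGE_SIZE` and `build_search_url` are hypothetical names introduced here for illustration.

```python
from urllib.parse import urlencode

PAGE_SIZE = 44  # assumed items per result page, taken from the loop in the post


def build_search_url(keyword, page):
    """Return the search URL for a 1-based page number."""
    offset = (page - 1) * PAGE_SIZE  # multiply first, then convert to text
    # urlencode percent-encodes the keyword and joins the parameters with "&"
    return "https://s.taobao.com/search?" + urlencode({"q": keyword, "s": offset})


for page in (1, 2, 3):
    print(build_search_url("dress", page))
```

Note that the multiplication has to happen before the conversion to a string: in Python, `str(page - 1) * 44` repeats the text 44 times instead of multiplying the number.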
CodePudding user response:
The regular expression is written incorrectly.
CodePudding user response:
Also, the data you are fetching is different from the original data.
CodePudding user response:
There is something wrong with the regular expression: inside the pair of single quotes there are three double quotation marks.
CodePudding user response:
The regular expression is flawed, and the urllib library is not working here. I usually use requests.
CodePudding user response:
I also encountered the same problem.
CodePudding user response:
I also encountered the same problem.
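The points raised in the replies can be checked offline before blaming either the pattern or the library. Here is a minimal sketch, assuming the page source embeds image links in the form `"pic_url":"//host/path.jpg"`. The `sample` string and the `extract_image_urls` helper are made up for illustration and are not real Taobao output.

```python
import re

# Made-up fragment in the assumed format; not real search output.
sample = '{"pic_url":"//img.example.com/a.jpg","title":"x"},{"pic_url":"//img.example.com/b.jpg"}'

# Single quotes delimit the Python string, so the double quotes inside need no escaping.
PIC_PATTERN = re.compile(r'"pic_url":"//(.*?)"')


def extract_image_urls(page_text):
    """Return full image URLs found in page_text, or an empty list."""
    return ["http://" + path for path in PIC_PATTERN.findall(page_text)]


print(extract_image_urls(sample))
print(extract_image_urls("<html>no image data here</html>"))
```

If the pattern finds links in a sample like this but returns an empty list on the downloaded text, that would suggest the fetched data does not contain the `pic_url` field at all, which fits the remark that it differs from the original data.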