Unable to crawl Taobao product pictures

Time:04-20

I wrote this by following a tutorial, but I still can't download Taobao product pictures. What is wrong? Can anyone tell me?

import urllib.request
import re
import random

keyname = "dress"
key = urllib.request.quote(keyname)
uapools = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/86.0.4240.111 Safari/537.36",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Maxthon 2.0)",
]

def ua(uapools):
    thisua = random.choice(uapools)
    print(thisua)
    headers = ("User-Agent", thisua)
    opener = urllib.request.build_opener()
    opener.addheaders = [headers]
    # Install the opener globally
    urllib.request.install_opener(opener)


for i in range(1, 2):  # page 1 only; range(1, 1) is empty, so the body would never run
    url = "https://s.taobao.com/search?q=" + key + "&s=" + str((i - 1) * 44)
    ua(uapools)
    date = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    pat = '"pic_url":"//(.*?)"'
    # I checked the length of pat here; pat is fine.
    imglist = re.compile(pat).findall(date)
    # I checked here as well: the length of imglist is 0.

    for j in range(0, len(imglist)):
        thisimg = imglist[j]
        thisimgurl = "http://" + thisimg
        localfile = "D:\\Program Files (x86)\\PyCharm Community Edition 2020.3.4\\date\\page\\" \
                    "taobao\\" + str(i) + str(j) + ".jpg"
        urllib.request.urlretrieve(thisimgurl, filename=localfile)
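
One way to narrow down where the empty imglist comes from is to test the URL building and the pattern separately from the network request. The sketch below is only an illustration: build_search_url, extract_pic_urls and describe_page are hypothetical helper names, the sample string is made up, and the idea that the server may return a page without any pic_url field (for example a login or verification page) is an assumption the post does not confirm. It also shows why the offset has to be multiplied before it is converted to a string.

```python
import re
from urllib.parse import urlencode

# Same pattern as in the post: capture what follows "pic_url":"//
PIC_URL_PATTERN = re.compile(r'"pic_url":"//(.*?)"')

ITEMS_PER_PAGE = 44  # offset step taken from the URL in the post


def build_search_url(keyword, page):
    """Build the search URL for a 1-based page number (hypothetical helper)."""
    # Multiply first, then convert: str(page - 1) * 44 would repeat the string 44 times.
    offset = (page - 1) * ITEMS_PER_PAGE
    return "https://s.taobao.com/search?" + urlencode({"q": keyword, "s": offset})


def extract_pic_urls(html):
    """Return full image URLs for every pic_url match in the page source."""
    return ["http://" + path for path in PIC_URL_PATTERN.findall(html)]


def describe_page(html):
    """Report whether the page source contains the field the pattern relies on."""
    if '"pic_url"' not in html:
        return "no pic_url field in response"
    return "%d image(s) found" % len(extract_pic_urls(html))


# Made-up sample text, used only to exercise the pattern offline.
sample = '{"pic_url":"//img.example.com/a.jpg"},{"pic_url":"//img.example.com/b.jpg"}'

print(build_search_url("dress", 2))
print(extract_pic_urls(sample))
print(describe_page("<html>login</html>"))
```

If describe_page reports that the field is missing for the text returned by urlopen, the pattern is probably not the problem; printing the first few hundred characters of that text shows what the server actually sent back.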

CodePudding user response:

Is there really no expert who can help?