import re  # text matching
import urllib.request, urllib.error  # request a specified URL
import xlwt  # write Excel documents
import sqlite3  # SQLite database
import sys
import random
# import urllib.request
# import urllib.error
def main():
    baseurl = "https://movie.douban.com/top250?start="
    # crawl web pages
    datalist = getdata(baseurl)
    savepath = "./douban film Top250.xls"
    # save data
    savedata(savepath)
    askURL("https://movie.douban.com/top250?start=0")
def getdata(baseurl):
    datalist = []
    return datalist
def askURL(url):  # get the page content of a specified URL
    head = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 Edg/87.0.664.66"}
    request = urllib.request.Request(url, headers=head)
    try:
        response = urllib.request.urlopen(request)
        html = response.read().decode("utf-8")
        print(html)
    except urllib.error.URLError as e:
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, "reason"):
            print(e.reason)
def savedata(savepath):
    pass  # function body was not included in the post
CodePudding user response:
You haven't posted the error message.
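One thing that stands out even without the error: as posted, nothing ever calls main(), so running the script would only define the functions. Below is a rough sketch of how the pieces might fit together, not a definitive fix. The names page_urls, ask_url, PAGE_SIZE and the timeout parameter are made up for illustration and are not from the original post, and the paging scheme (25 films per page, 10 pages) is an assumption about how the "start=" parameter is meant to be used.

```python
import urllib.error
import urllib.request

BASE_URL = "https://movie.douban.com/top250?start="  # base URL from the post
PAGE_SIZE = 25  # assumption: 25 films per page, 10 pages in total


def page_urls(baseurl, pages=10, page_size=PAGE_SIZE):
    """Build one URL per result page by appending the start offset."""
    return [baseurl + str(i * page_size) for i in range(pages)]


def ask_url(url, timeout=10):
    """Fetch one page and return its HTML, or an empty string on failure."""
    head = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) "
                      "Chrome/87.0.4280.88 Safari/537.36 Edg/87.0.664.66"
    }
    request = urllib.request.Request(url, headers=head)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.URLError as e:
        # HTTPError is a subclass of URLError and also carries .code
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, "reason"):
            print(e.reason)
        return ""


def main():
    # Only list the page URLs here; pass each one to ask_url() to download it.
    for url in page_urls(BASE_URL):
        print(url)


if __name__ == "__main__":
    main()
```

Returning the HTML from ask_url (instead of only printing it) would let getdata parse each page later. If something still fails, please post the full traceback.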