The crawler doesn't print any information

Time:09-18

Today in class I wrote a crawler following the course teacher's code, but it doesn't print any information. Could someone tell me what went wrong?
# CrawUnivRankingA.py
import requests
from bs4 import BeautifulSoup
import bs4

def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except:
        return ""

def fillUnivList(ulist, html):
    soup = BeautifulSoup(html, "html.parser")
    for tr in soup.find('tbody').children:
        if isinstance(tr, bs4.element.Tag):
            tds = tr('td')
            ulist.append([tds[0].string, tds[1].string, tds[3].string])

def printUnivList(ulist, num):
    print("{:^10}\t{:^6}\t{:^10}".format("rank", "school name", "total"))
    for i in range(num):
        u = ulist[i]
        print("{:^10}\t{:^6}\t{:^10}".format(u[0], u[1], u[2]))

def main():
    uinfo = []
    url = 'https://www.zuihaodaxue.cn/zuihaodaxuepaiming2016.html'
    html = getHTMLText(url)
    fillUnivList(uinfo, html)
    printUnivList(uinfo, 20)  # 20 univs

CodePudding user response:

if __name__ == "__main__":
    main()
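
The script in the question only defines main(); nothing ever calls it, so nothing is printed. A minimal sketch of the pattern, where the body of main() is a hypothetical stand-in for the crawler's real work:

```python
def main():
    # Hypothetical stand-in for the crawler's main(); it returns the text
    # so the behaviour is easy to check.
    message = "main() was called"
    print(message)
    return message


# Defining main() above does not run it. This guard calls main() only when
# the file is executed directly (python script.py), not when it is imported.
if __name__ == "__main__":
    main()
```

Note the two underscores on each side of both name and main.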

CodePudding user response:

Quoting weixin_45903952's reply on the 1st floor:
if __name__ == "__main__":
    main()

Add the expert's two lines of code and it will work.

CodePudding user response:

Quoting weixin_45903952's reply on the 1st floor:
if __name__ == "__main__":
    main()

But it reports that name is not defined. Why is that?

CodePudding user response:

Quoting weixin_45100250's reply on the 3rd floor:
Quote: weixin_45903952's reply on the 1st floor:
if __name__ == "__main__":
    main()

But it reports that name is not defined. Why is that?

Just copy it instead of typing it yourself; you typed one underscore too few (it is a double underscore, __).

CodePudding user response:

Your URL can't be opened; it returns 404.
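
A 404 is easy to miss with the getHTMLText from the question, because its bare except swallows the HTTPError raised by raise_for_status() and returns an empty string. A minimal sketch of a variant that reports the failure instead; the name get_html_text and the wording of the printed message are assumptions, not part of the thread:

```python
import requests


def get_html_text(url):
    """Fetch a page; on failure print the reason and return an empty string."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException as exc:
        # Catch only requests' own errors and say what went wrong,
        # instead of hiding every exception behind a bare except.
        print(f"Request for {url!r} failed: {exc}")
        return ""
```

With this version a 404 prints the status and URL before returning "", so an empty result can be traced back to the request rather than to the parsing code.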

CodePudding user response:

Quoting tsfy2003's reply on the 5th floor:
Your URL can't be opened; it returns 404.

Link: http://www.zuihaodaxue.cn/zuihaodaxuepaiming2016.html

CodePudding user response:

Quoting weixin_45903952's reply on the 4th floor:
Quote: weixin_45100250's reply on the 3rd floor:

Quote: weixin_45903952's reply on the 1st floor:
if __name__ == "__main__":
    main()

But it reports that name is not defined. Why is that?

Just copy it instead of typing it yourself; you typed one underscore too few (it is a double underscore, __).

After copying and running it, main() is highlighted in red and the error "invalid character in identifier" appears.
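
The thread never says which character caused "invalid character in identifier". A common cause when code is copied from a web page is a non-ASCII look-alike, such as a full-width parenthesis or a curly quote. A minimal sketch for locating such characters; the function name find_non_ascii and the pasted sample are hypothetical:

```python
import unicodedata


def find_non_ascii(source):
    """Return (line, column, character, Unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical pasted snippet: the parentheses after main are full-width.
pasted = 'if __name__ == "__main__":\n    main（）\n'
for hit in find_non_ascii(pasted):
    print(hit)
```

This also flags legitimate non-ASCII text inside strings and comments, so the results need to be read by eye; retyping the flagged characters on an English keyboard layout is usually enough.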

CodePudding user response:

Quoting weixin_45100250's reply on the 6th floor:
Quote: tsfy2003's reply on the 5th floor:
Your URL can't be opened; it returns 404.

Link: http://www.zuihaodaxue.cn/zuihaodaxuepaiming2016.html


import requests
from bs4 import BeautifulSoup

url = 'http://www.zuihaodaxue.cn/zuihaodaxuepaiming2016.html'
res = requests.get(url)
res.encoding = "utf-8"

soup = BeautifulSoup(res.text, 'html.parser')
items = soup.find_all('tbody', class_='hidden_zhpm')
for item in items:
    lists = item.find_all('tr')
    for all_school in lists:
        num = all_school.find_all('td')[0]
        school = all_school.find_all('td')[1]
        score = all_school.find_all('td')[4]
        print('ranking: ' + num.text + ' school name: ' + school.text + ' total score: ' + score.text)