A beginner's problem scraping the Maoyan top100, please advise!

Time:09-30

I have just started learning web scraping. I found an example online that scrapes the Maoyan top100 and followed along, but my crawl returns '[]'. I looked at the page that was returned: its content is not what the page source shows, and it mentions verification. Could you tell me how to solve this?

The source code is as follows:
 
import requests
from requests.exceptions import RequestException
import re

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'}

def get_one_page(url):
    # Fetch the page and return its HTML, or None on any failure
    try:
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            return response.text
        return None
    except RequestException:
        return None

def parse_one_page(html):
    # Capture rank, poster URL, title, stars, release time, and the two score parts
    pattern = re.compile(r'<dd>.*?board-index.*?>(\d+)</i>.*?data-src="(.*?)".*?name"><a'
                         + r'.*?>(.*?)</a>.*?star">(.*?)</p>.*?releasetime">(.*?)</p>'
                         + r'.*?integer">(.*?)</i>.*?fraction">(.*?)</i>.*?</dd>', re.S)
    items = re.findall(pattern, html)
    print(items)

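def main():
    url = "https://maoyan.com/board/4?"
    html = get_one_page(url)
    if html is None:
        # get_one_page returns None on a non-200 status or a request error
        print('Request failed')
        return
    parse_one_page(html)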

if __name__ == "__main__":
    main()
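One way to narrow this down is to run the pattern offline against markup of the expected shape. If it matches there but the live response still yields '[]', the problem is the page that came back, not the regex. The sketch below is only an illustration: the pattern is what the posted code appears to intend, the sample HTML is hand-written and not real Maoyan markup, and `page_title` and `classify_response` are hypothetical helper names.

```python
import re

# Assumption: this is the pattern the posted code intends to use.
PATTERN = re.compile(
    r'<dd>.*?board-index.*?>(\d+)</i>.*?data-src="(.*?)".*?name"><a'
    r'.*?>(.*?)</a>.*?star">(.*?)</p>.*?releasetime">(.*?)</p>'
    r'.*?integer">(.*?)</i>.*?fraction">(.*?)</i>.*?</dd>',
    re.S,
)

# Hand-written sample in the shape the pattern expects (not real site markup).
SAMPLE_BOARD_HTML = """
<dd>
  <i class="board-index board-index-1">1</i>
  <img data-src="https://example.com/poster.jpg" class="board-img" />
  <p class="name"><a href="/films/1" title="Sample Film">Sample Film</a></p>
  <p class="star">Starring: A, B</p>
  <p class="releasetime">Release: 1993-01-01</p>
  <p class="score"><i class="integer">9.</i><i class="fraction">5</i></p>
</dd>
"""

# Hand-written stand-in for a page that is not the board (for example a
# verification page); its real content is unknown.
SAMPLE_OTHER_HTML = (
    "<html><head><title>Verification</title></head>"
    "<body>Please verify.</body></html>"
)


def page_title(html):
    """Return the text of the first <title> element, or None if there is none."""
    match = re.search(r"<title>(.*?)</title>", html, re.S | re.I)
    return match.group(1).strip() if match else None


def classify_response(html):
    """Report whether the HTML contains entries the pattern can parse."""
    if html is None:
        return "no response"
    if PATTERN.findall(html):
        return "board page"
    return "other page"


print(classify_response(SAMPLE_BOARD_HTML))
print(classify_response(SAMPLE_OTHER_HTML), page_title(SAMPLE_OTHER_HTML))
```

Passing the text returned by get_one_page to classify_response and page_title shows whether the server sent the board or some other page.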