The source code is as follows:
import requests
from requests.exceptions import RequestException
import re

# Browser-like User-Agent header sent with every request
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'}

def get_one_page(url):
    # Fetch the page and return its HTML, or None on any failure
    try:
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            return response.text
        return None
    except RequestException:
        return None

def parse_one_page(html):
    # Groups: rank, poster URL, title, cast, release time, score integer part, score fraction part
    pattern = re.compile(r'<dd>.*?board-index.*?>(\d+)</i>.*?data-src="(.*?)".*?name"><a'
                         + r'.*?>(.*?)</a>.*?star">(.*?)</p>.*?releasetime">(.*?)</p>'
                         + r'.*?integer">(.*?)</i>.*?fraction">(.*?)</i>.*?</dd>', re.S)
    items = re.findall(pattern, html)
    print(items)
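
def main():
    url = "https://maoyan.com/board/4?"
    html = get_one_page(url)
    # get_one_page returns None on failure; re.findall would raise TypeError on None
    if html is None:
        print("Request failed, nothing to parse")
        return
    parse_one_page(html)

if __name__ == "__main__":
    main()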
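
To check the regular expression without hitting the site, it can help to run it against a small hand-written fragment. The sketch below is only an illustration: `SAMPLE_HTML` is a made-up `<dd>` entry (the real page markup may differ), `parse_sample` is a hypothetical helper that is not part of the posted code, and the pattern is written with the closing tags spelled out, which is an assumption about the page structure.

```python
import re

# Hypothetical fragment imitating one <dd> entry of the board page.
# The real markup served by the site may differ.
SAMPLE_HTML = """
<dd>
  <i class="board-index board-index-1">1</i>
  <a href="/films/0000" title="Example Film" class="image-link">
    <img data-src="https://example.com/poster.jpg" alt="Example Film" class="board-img" />
  </a>
  <div class="movie-item-info">
    <p class="name"><a href="/films/0000" title="Example Film">Example Film</a></p>
    <p class="star">
      Starring: Actor A, Actor B
    </p>
    <p class="releasetime">Release date: 1993-01-01</p>
  </div>
  <div class="movie-item-number score-num">
    <p class="score"><i class="integer">9.</i><i class="fraction">5</i></p>
  </div>
</dd>
"""

# Seven capture groups: rank, poster URL, title, cast, release time,
# integer part of the score, fractional part of the score.
# re.S lets "." match newlines, so one entry can span several lines.
PATTERN = re.compile(
    r'<dd>.*?board-index.*?>(\d+)</i>.*?data-src="(.*?)".*?name"><a'
    r'.*?>(.*?)</a>.*?star">(.*?)</p>.*?releasetime">(.*?)</p>'
    r'.*?integer">(.*?)</i>.*?fraction">(.*?)</i>.*?</dd>',
    re.S,
)


def parse_sample(html):
    """Return one dict per matched <dd> entry (hypothetical helper)."""
    results = []
    for index, image, title, star, release, integer, fraction in PATTERN.findall(html):
        results.append({
            "index": index,
            "image": image,
            "title": title,
            "star": star.strip(),        # drop surrounding whitespace and newlines
            "release": release.strip(),
            "score": integer + fraction,  # "9." + "5" -> "9.5"
        })
    return results


if __name__ == "__main__":
    for entry in parse_sample(SAMPLE_HTML):
        print(entry)
```

If this prints one entry for the sample but the live page yields an empty list, the response is probably not the expected board HTML (for example a verification page), so printing the fetched text is a reasonable next check.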