Scraping news articles always fails with an error; Python beginner asking the experts for help

Time:09-29

# coding: utf-8
# import the required modules
import requests
from bs4 import BeautifulSoup
from requests_html import HTMLSession

url = 'https://gy.house.ifeng.com//news'
wbdata = requests.get(url).text
# parse the fetched text
soup = BeautifulSoup(wbdata, 'lxml')
# pick the target elements out of the parsed document with a CSS selector; returns a list
news_titles = soup.select('body > div.w1180.mb30 > div.content.clearfix > div.newsList.clearfix.fl > div.newsDetail > a')
# iterate over the returned list
for n in news_titles:
    # extract the title and link
    title = n.get_text()
    link = n.get("href")
    date = {'title': "".join(title.split()), 'links': link}
    date1 = {" ".join(title.split())}
    session = HTMLSession()
    r = session.get(date1)
    title1 = {r.html.find('body > div.w1180.mb30 > div.content.clearfix > div.article-content.fl > div.article > div.title', first=True)}
    context1 = r.html.find('body > div.w1180.mb30 > div.content.clearfix > div.article-content.fl > div.article > div.content-info > p', first=True)
    print(title1.text)
    print(context1.text)
Using this code to scrape the news articles at https://gy.house.ifeng.com//news always fails with an error. I am a Python beginner; could one of the experts please help?
Error output: "C:\Program Files\Python38\python.exe" C:/Users/sikezx-all/PycharmProjects/PythonTest/Test1.py
Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\site-packages\requests\models.py", line 379, in prepare_url
    scheme, auth, host, port, path, query, fragment = parse_url(url)
  File "C:\Program Files\Python38\lib\site-packages\urllib3\util\url.py", line 392, in parse_url
    return six.raise_from(LocationParseError(source_url), None)
  File "<string>", line 3, in raise_from
urllib3.exceptions.LocationParseError: Failed to parse: {'In the first three quarters, new tax and fee cuts nationwide totaled more than 1.78 trillion yuan. On Oct. 30, at a news conference, the State Administration of Taxation described how tax authorities in the first three quarters of this year carried out tax and fee cuts, organized tax revenue, deepened the "delegate, regulate, serve" reform, and optimized the tax business environment. People's Daily joint role of science and technology innovation 2019-11-01'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/sikezx-all/PycharmProjects/PythonTest/Test1.py", line 71, in <module>
    r = session.get(date1)
  File "C:\Program Files\Python38\lib\site-packages\requests\sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "C:\Program Files\Python38\lib\site-packages\requests\sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "C:\Program Files\Python38\lib\site-packages\requests\sessions.py", line 452, in prepare_request
    p.prepare(
  File "C:\Program Files\Python38\lib\site-packages\requests\models.py", line 313, in prepare
    self.prepare_url(url, params)
  File "C:\Program Files\Python38\lib\site-packages\requests\models.py", line 381, in prepare_url
    raise InvalidURL(*e.args)
requests.exceptions.InvalidURL: Failed to parse: {'In the first three quarters, new tax and fee cuts nationwide totaled more than 1.78 trillion yuan. On Oct. 30, at a news conference, the State Administration of Taxation described how tax authorities in the first three quarters of this year carried out tax and fee cuts, organized tax revenue, deepened the "delegate, regulate, serve" reform, and optimized the tax business environment. People's Daily joint role of science and technology innovation 2019-11-01'}

Process finished with exit code 1

CodePudding user response:

The problem is in what you are passing in to be parsed: why does it have only keys and no values? Normally it should look like {"sad": "dsa"}, which becomes xxx.xxx?sad=dsa. Yours is missing half of that. Also, does the news page actually use parameters anywhere?
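
To make the reply concrete, here is a minimal sketch using only the standard library, not the asker's actual fix. It assumes the value that reaches session.get() in the question is a one-element set built from the title text, which would explain why the error message shows braces around the headline. The helper name article_url, the placeholder title, and the placeholder href are all hypothetical; only the listing URL comes from the question.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://gy.house.ifeng.com//news"  # listing page from the question


def article_url(href, base=BASE_URL):
    """Return an absolute URL string for an <a href="..."> value.

    Hypothetical helper (not part of requests or requests_html): it rejects
    anything that is not a str, so a set or dict is caught before any request.
    """
    if not isinstance(href, str):
        raise TypeError(f"URL must be a str, got {type(href).__name__}")
    return urljoin(base, href)


# What the question's code builds: a set holding the title text, not a URL.
title = "Some   headline 2019-11-01"      # placeholder title text
date1 = {" ".join(title.split())}          # braces around one value make a set
print(type(date1).__name__)                # set
print(str(date1))                          # {'Some headline 2019-11-01'}

# What would normally be requested instead: the href, resolved to a full URL.
print(article_url("/news/example.shtml"))  # placeholder href

# The reply's point: key/value pairs in a dict become a query string.
print(urlencode({"sad": "dsa"}))           # sad=dsa
```

In the question's loop, this would correspond to calling session.get() with the link value (resolved to an absolute URL if the href is relative) rather than date1. The braces around the r.html.find(...) call for title1 likewise create a set, and a set has no .text attribute, so that line would probably need the braces removed as well.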