Connection error for requests and BeautifulSoup4. Last night my code worked and I haven't chang

Time:10-04

I'm writing code that performs web scraping. I'm trying to fetch the HTML of a page on the Cambridge Dictionary website, but I get an error. I would really appreciate it if you could explain the cause of the error and how to fix it.

Here is my code:

import requests
from bs4 import BeautifulSoup


def checkWord(word):
    url_top = "https://dictionary.cambridge.org/dictionary/english/"
    url = url_top + word

    headers = requests.utils.default_headers()

    headers.update(
        {
            'User-Agent': 'My User Agent 1.0',
        }       
    )

    html = requests.get(url, headers=headers).text 
    soup = BeautifulSoup(html, 'html.parser') 
    check = soup.find("title")
    boolean = check.string

    
    if boolean == "Cambridge English Dictionary: Meanings & Definitions":
        return False
    else:
        return True

word = "App"
checkWord(word)

However, an error occurred at the line html = requests.get(url, headers=headers).text

The error message is shown below:

Exception has occurred: ConnectionError
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

  File "<string>", line 3, in raise_from

During handling of the above exception, another exception occurred:

  File "<string>", line 3, in raise_from

During handling of the above exception, another exception occurred:

CodePudding user response:

Your code works fine for me every time. Most likely the problem is on your side: it may be a temporary issue with your local network, so check your internet connection. Here is the version I ran:

import requests
from bs4 import BeautifulSoup


def checkWord(word):
    url_top = "https://dictionary.cambridge.org/dictionary/english/"
    url = url_top + word

    headers = requests.utils.default_headers()

    headers.update(
        {
            'User-Agent': 'Mozilla/5.0',
        }       
    )

    html = requests.get(url, headers=headers).text 
    soup = BeautifulSoup(html, 'html.parser') 
    check = soup.find("title").text
    print(check)


word = "App"
checkWord(word)

Output:

APP | meaning, definition in Cambridge English Dictionary
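
If the failure really is a temporary network problem, retrying the request a few times is often enough. Below is a minimal sketch of that idea, not a definitive implementation. The helper name get_with_retries, the number of attempts, and the delay are all assumptions made for illustration. It takes the request as a callable so the retry logic can be tried without a network, and it catches OSError because the exceptions raised by requests (including its ConnectionError) derive from it.

```python
import time


def get_with_retries(fetch, attempts=3, delay=1.0):
    """Call fetch() until it succeeds or `attempts` tries are used up.

    `fetch` is any zero-argument callable, for example
    lambda: requests.get(url, headers=headers, timeout=10).
    The name and defaults are illustrative, not part of requests.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError as error:  # requests' exceptions are OSError subclasses
            last_error = error
            if attempt < attempts:
                time.sleep(delay * attempt)  # wait a little longer each time
    raise last_error
```

In checkWord this could be used as html = get_with_retries(lambda: requests.get(url, headers=headers, timeout=10)).text. If every attempt fails, the last exception is raised again, so a persistent block by the server still shows up as an error.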

CodePudding user response:

It looks like the remote host has blocked you. If you can still open the website in a web browser on your computer, try changing the User-Agent to something like this:

"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36"
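
Applied to the code from the question, that suggestion might look roughly like the sketch below. The function name fetch_page, the http_get parameter, and the 10 second timeout are assumptions for illustration only. The http_get parameter exists just so the function can be tried with a stub instead of a real request. Whether this User-Agent is enough to get past the block depends on the site.

```python
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36"
)


def fetch_page(url, user_agent=BROWSER_USER_AGENT, timeout=10, http_get=None):
    """Return the page HTML, or None if the request fails.

    `http_get` is a hypothetical hook: any callable with the same
    signature as requests.get. By default requests.get is used.
    """
    if http_get is None:
        import requests  # imported lazily so a stub can be used offline
        http_get = requests.get
    try:
        response = http_get(url, headers={"User-Agent": user_agent}, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    except OSError as error:  # requests' exceptions are OSError subclasses
        print(f"Request to {url} failed: {error}")
        return None
    return response.text
```

With this in place, checkWord can call html = fetch_page(url) and return early when the result is None instead of crashing with a traceback.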