InvalidSchema: No connection adapters were found for "link"?

Time:09-01

I have a dataset with multiple links and I'm trying to get the text of all the links using the code below, but I'm getting this error message: InvalidSchema: No connection adapters were found for "'https://en.wikipedia.org/wiki/Wagner_Group'".

Dataset:

   links
   'https://en.wikipedia.org/wiki/Wagner_Group'
   'https://en.wikipedia.org/wiki/Vladimir_Putin'
   'https://en.wikipedia.org/wiki/Islam_in_Russia'

The code I'm using to web-scrape is:

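import requests
from bs4 import BeautifulSoup

def get_data(url):
    page = requests.get(url)
    soup = BeautifulSoup(page.content, 'html.parser')
    text = ""
    # Accumulate the text of every <p> element instead of keeping only the last one
    for paragraph in soup.find_all('p'):
        text += paragraph.text
    return text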

# Works fine
url = 'https://en.wikipedia.org/wiki/M142_HIMARS'
get_data(url)

# Doesn't work
df['links'].apply(get_data)

Error: InvalidSchema: No connection adapters were found for "'https://en.wikipedia.org/wiki/Wagner_Group'"

Thank you in advance

It works just fine when I apply it to a single URL, but it doesn't work when I apply it to a DataFrame column.

CodePudding user response:

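The problem is not df['links'].apply(get_data) itself. The error message shows the URL wrapped in an extra pair of single quotes ("'https://...'"), which means each value in df['links'] contains literal quote characters, so requests cannot match the string against its http:// or https:// adapters. Strip the quotes from the column before calling apply, or start from a clean list of links as below.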

Example:

import requests
from bs4 import BeautifulSoup
import pandas as pd
links = [
    'https://en.wikipedia.org/wiki/Wagner_Group',
    'https://en.wikipedia.org/wiki/Vladimir_Putin',
    'https://en.wikipedia.org/wiki/Islam_in_Russia',
]

data = []
for url in links:
    req = requests.get(url)
    soup = BeautifulSoup(req.text,'lxml')
    
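    # Assumption: the article body is the div with class "mw-parser-output";
    # select the paragraphs that follow a table inside it
    for pra in soup.select('div.mw-parser-output > table ~ p'):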
        paragraph = pra.get_text(strip=True)

        data.append({
            'paragraph':paragraph
            })
#print(data)
df = pd.DataFrame(data)
print(df)

Output:

                            paragraph
0    TheWagner Group(Russian:Группа Вагнера,romaniz...
1    The group came to global prominence during the...
2    Because it often operates in support of Russia...
3    The Wagner Group first appeared in Ukraine in ...
4    The Wagner Group itself was first active in 20...
..                                                 ...
440  A record 18,000 Russian Muslim pilgrims from a...
441  For centuries, theTatarsconstituted the only M...
442  A survey published in 2019 by thePew Research ...
443         Percentage of Muslims in Russia by region:
444  According to the 2010 Russian census, Moscow h...

[445 rows x 1 columns]
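
If you would rather keep the DataFrame and apply, here is a minimal sketch of cleaning the links first. It assumes the values in the links column really do carry literal quote characters, which is what the doubled quoting in the error message suggests. The names clean_url and is_fetchable are hypothetical helpers made up for this illustration, not part of requests or pandas.

```python
def clean_url(raw: str) -> str:
    """Strip surrounding whitespace and literal quote characters from a URL string."""
    return raw.strip().strip("'\"")


def is_fetchable(url: str) -> bool:
    """Return True if the string starts with a scheme that requests handles by default."""
    return url.lower().startswith(("http://", "https://"))


# Values as they appear in the question's dataset, quotes included (assumed)
raw_links = [
    "'https://en.wikipedia.org/wiki/Wagner_Group'",
    "'https://en.wikipedia.org/wiki/Vladimir_Putin'",
    "'https://en.wikipedia.org/wiki/Islam_in_Russia'",
]

# Remove the stray quotes so each string starts with "https://"
cleaned = [clean_url(link) for link in raw_links]

for before, after in zip(raw_links, cleaned):
    print(is_fetchable(before), is_fetchable(after), after)
```

On the DataFrame the same cleanup would be df['links'] = df['links'].str.strip().str.strip("'\""), after which df['links'].apply(get_data) should receive plain URLs.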