I am not able to find relevant data using soup.findAll() while web scraping with BeautifulSoup

Time:10-18

I am trying to scrape data (names, ages, teams) from this website: https://sofifa.com/players?offset=0. When I try to find the relevant data using soup.findAll(), I get an empty list.

    import pandas as pd
    import re
    import requests
    from bs4 import BeautifulSoup

    k=[]
    url="https://sofifa.com/players?offset=0"
    resp=requests.get(url)
    soup=BeautifulSoup(resp.content,'lxml')
    for omk in soup.find_all('><div class="bp3-text-overflow-ellipsis">'):
      k.append(str(omk))
    print(k)

I read some answers that mentioned tags and classes, but I don't know what these are.

CodePudding user response:

Based on your question, here is an example of a working solution:

Code:

    import requests
    from bs4 import BeautifulSoup

    # Send a browser-like User-Agent header with the request
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36'}

    url = "https://sofifa.com/players?offset=0"
    resp = requests.get(url, headers=headers)
    soup = BeautifulSoup(resp.content, 'lxml')

    # Loop over the table rows and read the name from the first link in the name column
    for omk in soup.select('table.table.table-hover.persist-area tbody tr'):
        name = omk.select_one('td.col-name a:nth-child(1) div').get_text(strip=True)
        print(name)

Output:

M. Sarr
É. Mendy
P. Daka
F. Wirtz
J. Timber
C. De Ketelaere
Cristiano Ronaldo
D. Maldini
J. Bellingham
Lucas Paquetá
Gavi
Antony
A. Spörle
K. Adeyemi
E. Haaland
D. Kamada
M. Salah
N. Madueke
A. Tchouaméni
M. Greenwood
M. Lacroix
R. Gravenberch
Pedri
J. Gvardiol
N. Lang
Raphinha
A. Hložek
J. Musiala
F. Chiesa
L. Messi
B. Brereton Díaz
R. Cherki
D. Vlahović
Ansu Fati
Pedro Benito
G. Raspadori
Yeremy Pino
Y. Tielemans
K. Mbappé
E. Camavinga
D. Scarlett
A. Bastoni
J. Sancho
T. Hernández
A. Davies
J. Koundé
A. Saint-Maximin
H. Elliott
S. Tonali
A. Broja
A. Isak
M. Vandevoordt
P. Foden
F. Kessié
J. Doku
E. Tapsoba
K. Mitoma
Luiz Felipe
Nuno Mendes
S. Dest
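
If it helps, here is a small offline sketch of the same CSS-selector approach, run against a made-up HTML fragment instead of the live page. The sample markup, the `col-age` class, and the helper names `text_or_none` and `extract_rows` are all assumptions for illustration; the real table may be structured differently. The sketch also guards against `select_one` returning `None` when a cell is missing, which would otherwise raise an `AttributeError` on `.get_text()`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup loosely modelled on a player table; not copied from the site.
SAMPLE_HTML = """
<table class="table table-hover persist-area">
  <tbody>
    <tr>
      <td class="col-name"><a href="/player/1"><div>A. Example</div></a></td>
      <td class="col-age">21</td>
    </tr>
    <tr>
      <td class="col-name"><a href="/player/2"><div>B. Sample</div></a></td>
    </tr>
  </tbody>
</table>
"""


def text_or_none(row, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = row.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


def extract_rows(html):
    """Collect one dict per table row using CSS selectors."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table.table.table-hover.persist-area tbody tr"):
        rows.append({
            "name": text_or_none(tr, "td.col-name a:nth-child(1) div"),
            "age": text_or_none(tr, "td.col-age"),  # assumed class name
        })
    return rows


players = extract_rows(SAMPLE_HTML)
print(players)
```

The resulting list of dicts can be passed straight to `pd.DataFrame(players)` if you want a table, once the selectors have been checked against the actual page source.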

CodePudding user response:

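There are a couple of things to note about your code snippet.

The first is the parser: BeautifulSoup takes a parser name as its second argument. You already pass 'lxml', which works as long as the lxml package is installed; Python's built-in parser is an alternative that needs no extra install:

    soup = BeautifulSoup(resp.content, 'html.parser')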

Second, find_all() expects a tag name and optional attribute filters, not a piece of raw HTML. To search for div elements with a class of bp3-text-overflow-ellipsis, the proper syntax is the following:

    soup.find_all("div", class_="bp3-text-overflow-ellipsis")

Here is the documentation related to find_all: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find-all
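
To see the difference without touching the network, here is a minimal sketch that runs both calls against a made-up HTML fragment. Only the class name comes from the question; the fragment itself is an assumption, and the live page may not contain that class in the HTML the server returns.

```python
from bs4 import BeautifulSoup

# Made-up fragment for illustration; the class name is the one from the question.
html = """
<div class="bp3-text-overflow-ellipsis">First</div>
<div class="other">Skipped</div>
<div class="bp3-text-overflow-ellipsis">Second</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Raw markup passed as the tag name matches nothing, because no tag has that name.
wrong = soup.find_all('><div class="bp3-text-overflow-ellipsis">')

# Tag name plus a class_ filter ("class" is a reserved word in Python, hence the underscore).
right = soup.find_all("div", class_="bp3-text-overflow-ellipsis")
texts = [div.get_text(strip=True) for div in right]

print(wrong)
print(texts)
```

The first call returns an empty list, which is the behaviour described in the question, while the second returns the two matching div elements.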
