Problem with BeautifulSoup when retrieving image src attribute and comparing it

Time:08-17

I want to retrieve the list of games played from the FIDE archives (e.g. https://ratings.fide.com/view_source.phtml?code=272077). I can get all the columns without trouble, but to know which player was white or black I also need the image that sits on the same row as each game: <img align="absbottom" border="0" src="/imga/clr_bl.gif"/> for black and <img align="absbottom" border="0" src="/imga/clr_wh.gif"/> for white. The problem: when I try to set the player variable to 1 for white and 0 for black, neither of my two if conditions matches (I also tried using in, which doesn't work either). Here is the code:

for row in rows:
    picture = row.find_all('img')
    print("loop")
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    cols.append(picture)
    data.append([ele for ele in cols if ele])
for row in data:
    if "ID" and "Name" in row:
        continue
    if 'Round' and 'Opp. name' in row:
        continue
    if "Game" in row:
        tempdata = {
            "ID" : row[0],
            "Fed" : row[2],
            "Rating" : row[3]
        }
        continue
    if '<img align="absbottom" border="0" src="/imga/clr_wh.gif"/>' == picture[0]:
        player = 1
        print("player is white")
    elif '<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>' == picture[0]:
        player = 0
        print("player is black")
    else:
        player = 1
        print(picture)
        print("an error occured")
    print(player)  

I get

[<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>]
an error occured
1
[<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>]
an error occured
1
[<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>]
an error occured
1
[<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>]
an error occured

and so on.

CodePudding user response:

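picture[0] is a BeautifulSoup Tag object, not a string, so comparing it to an HTML string with == is never true. Check the tag's src attribute instead and change the condition to: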

...

if "_wh" in picture[0]["src"]:
    player = 1
    print("player is white")
elif "_bl" in picture[0]["src"]:
    player = 0
    print("player is black")

...
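To see why the original comparison fails, here is a minimal sketch that runs without network access. The one-row HTML snippet is a made-up stand-in for the FIDE page, and the variable names (literal, tag_equals_literal, is_black) are only illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical one-row stand-in for the FIDE table (not the real page).
html = (
    '<table><tr><td>'
    '<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>'
    '</td></tr></table>'
)
soup = BeautifulSoup(html, "html.parser")
picture = soup.find_all("img")  # a list of Tag objects, as in the question

tag = picture[0]
literal = '<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>'

# A Tag is not a str, so this comparison is False even though the
# printed form of the tag looks identical to the literal.
tag_equals_literal = (tag == literal)

# Reading the attribute gives a plain string that can be tested safely.
src = tag["src"]
is_black = "_bl" in src

print(type(tag).__name__, tag_equals_literal, src, is_black)
```

Testing the src value also avoids depending on attribute order or quoting in the serialized tag.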

EDIT: Example to get player colors:

import requests
from bs4 import BeautifulSoup

url = "https://ratings.fide.com/view_source.phtml?code=272077"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

for row in soup.select("tr:has(img[align]):not(:has(table))"):
    color = "white" if "_wh" in row.img["src"] else "black"
    name = row.select_one("img + a")
    print("{:<10} {}".format(color, name.text if name else "N/A"))

Prints:

white      Tas, Ruzgar
black      Kartop, Metehan
white      Haznedar, Galip
black      Ugur, Cem
white      Yuzsever, Cenk
black      Kiziltas, Inanc Vefa
white      Yurtoglu, Osman Talha
black      Ozcan, Kadir Kutay
white      Acar, Cengiz Can
black      Yardimci, Ramazan

...
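One more thing to watch in the question's code: the second loop reads picture, which by then still holds the images from the last row of the first loop, so every game gets the same answer. A possible way around this, sketched below under assumptions, is to work out the color while iterating over the rows and store it next to the columns. The sample HTML, the player_color helper, and the dictionary layout are hypothetical and only meant for illustration.

```python
from bs4 import BeautifulSoup

# Made-up sample rows; the real page has more columns and nested tables.
SAMPLE_HTML = """
<table>
  <tr><td><img align="absbottom" border="0" src="/imga/clr_wh.gif"/> <a href="#">Tas, Ruzgar</a></td><td>cell</td></tr>
  <tr><td><img align="absbottom" border="0" src="/imga/clr_bl.gif"/> <a href="#">Kartop, Metehan</a></td><td>cell</td></tr>
  <tr><td>Round</td><td>Opp. name</td></tr>
</table>
"""


def player_color(row):
    """Return 1 for white, 0 for black, None if the row has no color image."""
    img = row.find("img")
    if img is None:
        return None
    src = img.get("src", "")
    if "_wh" in src:
        return 1
    if "_bl" in src:
        return 0
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

data = []
for row in soup.find_all("tr"):
    cols = [td.get_text(strip=True) for td in row.find_all("td")]
    # The color is decided per row, while the row's own Tag is still at hand.
    data.append({"cols": cols, "player": player_color(row)})

for entry in data:
    print(entry["player"], entry["cols"])
```

Rows without an image (such as header rows) get None, so they can be skipped later without string matching on the cell text.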