How to extract attribute value from a tag in BeautifulSoup

Time:07-10

I am trying to extract the value of an attribute from a tag (in this case, a td element). The code is below; the HTML document loads correctly, self.data contains a string of HTML, and this method is part of a class:

def getLine(self):
    dat = BeautifulSoup(self.data, "html.parser")
    tags = dat.find_all("tr")
    for current in tags:
        line = current.findChildren("td", recursive=False)
        for currentLine in line:
            # print (currentLine)
            clase = currentLine["class"] # <-- PROBLEMATIC LINE
            if clase is not None and "result" in clase:
                valor = Line()
                valor.name = currentLine.text  # text of this cell; line is the whole list of cells

The error is in the line clase = currentLine["class"]. I just need to check whether the tag has this attribute and act on it when its value is "result".

File "C:\DataProgb\urlwrapper.py", line 43, in getLine
    clase = currentLine["class"] #Trying to extract attribute class
\AppData\Local\Programs\Python\Python39\lib\site-packages\bs4\element.py", line 1519, in __getitem__
    return self.attrs[key]
KeyError: 'class'

It should work, because it's just an element. I don't understand this error. Thanks.

CodePudding user response:

The main issue is that you access the attribute by key directly, which raises a KeyError if the tag does not have that attribute:

currentLine["class"]

Instead, use get(), which returns None when the attribute is missing:

currentLine.get("class")
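A minimal sketch of the difference, using a made-up two-cell snippet (the markup and variable names here are illustrative and not taken from the question):

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one cell has a class attribute, the other does not.
html = '<table><tr><td class="result">A</td><td>B</td></tr></table>'
soup = BeautifulSoup(html, "html.parser")
with_class, without_class = soup.find_all("td")

# Subscript access raises KeyError when the attribute is absent.
try:
    without_class["class"]
    raised = False
except KeyError:
    raised = True

# get() returns None for the missing attribute instead of raising.
missing = without_class.get("class")
# class is a multi-valued attribute, so a present value comes back as a list.
present = with_class.get("class")

print(raised, missing, present)
```

Because a present class value is a list of class names, the check "result" in clase from the question tests list membership, so it also matches cells that carry additional classes.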

From the Python docs for dict.get(key[, default]), which a tag's get() mirrors for its attributes:

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
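Since get() accepts a default, passing an empty list removes the need for a separate None check before the membership test. The following is only a sketch of how the loop from the question might look with that change; the function name result_cell_texts and the sample markup are assumptions made for illustration, and the Line class from the question is replaced here by a plain list of strings.

```python
from bs4 import BeautifulSoup


def result_cell_texts(html):
    """Collect the text of each direct <td> child of a <tr> whose class includes "result"."""
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for row in soup.find_all("tr"):
        for cell in row.find_all("td", recursive=False):
            # Fall back to an empty list so the "in" test is safe without a None check.
            classes = cell.get("class", [])
            if "result" in classes:
                # Read the text from the individual cell, not from the list of cells.
                texts.append(cell.get_text(strip=True))
    return texts


# Hypothetical sample input: two rows, each with one matching cell.
sample = (
    "<table>"
    '<tr><td class="result odd">first</td><td>skipped</td></tr>'
    '<tr><td class="other">skipped</td><td class="result">second</td></tr>'
    "</table>"
)
print(result_cell_texts(sample))
```

If only the matching cells are needed, Beautiful Soup can also filter by class directly, for example with row.find_all("td", class_="result", recursive=False).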
