How are BeautifulSoup objects able to have a tag as an attribute?

Time:12-07

In order to extract a tag, you use the tag name as an attribute of the Tag/BeautifulSoup object. For example, to extract the <head> tag, I need to write soupobject.head

I'm still a beginner in programming and Python, but from my understanding and a quick Google search, object attributes are variables belonging to that object. I can write a script that has a variable named p and a condition so that, when the script runs and finds a <p> tag, it parses the relevant data and assigns it to the p variable I made. What I don't know is how to write a script that itself "defines" a variable and names it after an HTML tag name.

I hope I'm explaining it well enough. I tried to understand the BeautifulSoup source code, but honestly I'm still having trouble understanding most of it.

My only assumption/theory about how it is able to do that is that it builds Python code as a string and then imports it. I don't know if that is possible.

CodePudding user response:

Have a look at how the Python data model lets you customize classes via special methods, and particularly at customizing attribute access via the __getattr__() and __getattribute__() magic methods.

In this particular case (bs4), you can have a look at the bs4 source code for the Tag class, where they define the Tag.__getattr__() magic method. Note that the BeautifulSoup class inherits from Tag.

Also note that soup.head is not the only way to access the head tag. You can do soup.find('head') - that is exactly what they do in Tag.__getattr__().

To expand with an example:

class Foo:
    def __init__(self):
        self.spam = 'spam'

    def __getattr__(self, name):
        return f'Attribute "{name}" returned from __getattr__'

foo = Foo()
print(foo.spam)
print(foo.eggs)

output:

spam
Attribute "eggs" returned from __getattr__
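To connect this back to bs4, here is a minimal sketch of a tag-like class whose __getattr__() delegates to its own find() method. MiniTag and its fields are hypothetical names made up for illustration, not the real bs4 implementation; the point is only that no variable named after a tag is ever created, the lookup happens at access time.

```python
class MiniTag:
    """A toy stand-in for a bs4 Tag (hypothetical, for illustration only)."""

    def __init__(self, name, children=None, text=""):
        self.name = name
        self.children = children or []
        self.text = text

    def find(self, name):
        """Return the first descendant called `name` in document order, or None."""
        for child in self.children:
            if child.name == name:
                return child
            found = child.find(name)
            if found is not None:
                return found
        return None

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup has failed,
        # so self.name, self.children and self.text never reach this point.
        if name.startswith("__"):
            # Leave dunder lookups (copying, pickling, ...) alone.
            raise AttributeError(name)
        return self.find(name)

    def __repr__(self):
        return f"<{self.name}>"


soup = MiniTag("html", [
    MiniTag("head", [MiniTag("title", text="Demo")]),
    MiniTag("body", [MiniTag("p", text="Hello")]),
])

print(soup.head)                 # <head>
print(soup.head.title.text)      # Demo
print(soup.p is soup.find("p"))  # True
print(soup.table)                # None (no such tag in the tree)
```

In this sketch a missing tag gives None instead of raising AttributeError, because the lookup simply returns whatever find() returns.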

CodePudding user response:

In general, it is not considered good practice to have variable variable names, that is, names that are themselves decided at run time. Some languages even make it impossible to do so. To achieve the same thing, you can use a dictionary, which can have variable key names and variable values.

my_dict = {'key_1': 'value 1'}
print(my_dict['key_1'])
# out: 'value 1'

my_dict['some_key'] = 'another value'
# now your dictionary looks like this: 
# {'key_1': 'value 1', 'some_key': 'another value'}
print(my_dict['some_key'])
# out: 'another value'

# as for dynamic names:
some_name = 'key_3'
my_dict[some_name] = 'value 3'
print(my_dict)
# out: {'key_1': 'value 1', 'some_key': 'another value', 'key_3': 'value 3'}
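Applied to the HTML case from the question, the same idea could look roughly like the sketch below. It uses the standard library's html.parser instead of bs4, and TagTextCollector is a hypothetical helper written for this answer. It assumes well-formed input without void elements such as <br>, since those never produce a closing tag.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Hypothetical helper: groups text by the tag that directly encloses it."""

    def __init__(self):
        super().__init__()
        self.open_tags = []                    # stack of currently open tag names
        self.text_by_tag = defaultdict(list)   # tag name -> list of text chunks

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Assumes well-formed nesting: the closing tag matches the top of the stack.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.open_tags:
            # The tag name is used as a dictionary key, not as a variable name.
            self.text_by_tag[self.open_tags[-1]].append(text)


collector = TagTextCollector()
collector.feed("<html><head><title>Demo</title></head>"
               "<body><p>first</p><p>second</p></body></html>")

print(collector.text_by_tag["p"])      # ['first', 'second']
print(collector.text_by_tag["title"])  # ['Demo']
```

The tag names only ever appear as keys, so the script does not need to know in advance which tags the document contains.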