In order to extract a tag, you use the tag name as an attribute of the Tag/BeautifulSoup object. For example, to extract the <head> tag, I need to write soupobject.head
I'm still a beginner in programming and Python, but from my understanding and a quick Google search, object attributes are variables that belong to an object. I could write a script that has a variable named p and a condition so that, when the script runs and finds a <p> tag, it parses the relevant data from it and assigns it to the p variable I made. What I don't know is how to write a script that itself "defines" a variable and names it after an HTML tag name.
I hope I am explaining this well enough. I tried to read the BeautifulSoup source code, but honestly I still have trouble understanding most of it.
My only theory on how it is able to do that is that it builds Python code as a string and then imports it. I don't know if that is possible.
CodePudding user response:
Have a look at data model class customization via special methods, and particularly at customizing attribute access via the __getattr__() and __getattribute__() magic methods.
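The difference between the two hooks can be sketched as follows. This is only an illustration: the class names Fallback and Intercept are made up for the example. __getattr__() runs only when normal lookup fails, while __getattribute__() runs on every attribute access.

```python
class Fallback:
    def __init__(self):
        self.spam = "spam"

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        return f"fallback for {name}"


class Intercept:
    def __init__(self):
        self.spam = "spam"

    def __getattribute__(self, name):
        # Called on every attribute access, even for existing attributes.
        # Delegate to object.__getattribute__ to avoid infinite recursion.
        value = object.__getattribute__(self, name)
        return f"intercepted {name}: {value}"


print(Fallback().spam)   # spam
print(Fallback().eggs)   # fallback for eggs
print(Intercept().spam)  # intercepted spam: spam
```

For the "attribute named after a tag" behaviour, __getattr__() is usually the better fit, because it leaves ordinary attributes alone.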
In this particular case (bs4), you can have a look at the bs4 source code for the Tag class, where they define the Tag.__getattr__() magic method. Note that the BeautifulSoup class inherits from Tag.
Also note that soup.head is not the only way to access the head tag. You can do soup.find('head'), which is exactly what they do in Tag.__getattr__().
To expand with an example:
class Foo:
    def __init__(self):
        self.spam = 'spam'

    def __getattr__(self, name):
        return f'Attribute "{name}" returned from __getattr__'

foo = Foo()
print(foo.spam)
print(foo.eggs)
output:
spam
Attribute "eggs" returned from __getattr__
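Applied to the tag case, the same hook can forward an unknown attribute name to a search method. The following is a minimal sketch, not bs4's actual implementation: MiniTag and its find() method are hypothetical stand-ins written for this example, and the tree is built by hand instead of being parsed from HTML.

```python
class MiniTag:
    """Hypothetical stand-in for a parsed tag; not bs4's real Tag class."""

    def __init__(self, name, children=None, text=""):
        self.name = name
        self.children = children or []
        self.text = text

    def find(self, name):
        """Return the first descendant whose tag name matches, else None."""
        for child in self.children:
            if child.name == name:
                return child
            found = child.find(name)
            if found is not None:
                return found
        return None

    def __getattr__(self, name):
        # Reached only when normal lookup fails, so name, children and text
        # are never routed through here.
        if name.startswith("__"):
            # Leave special-method lookups (copy, pickle, ...) alone.
            raise AttributeError(name)
        return self.find(name)


soup = MiniTag("[document]", [
    MiniTag("html", [
        MiniTag("head", [MiniTag("title", text="Example")]),
        MiniTag("body", [MiniTag("p", text="Hello")]),
    ]),
])

print(soup.head.name)        # head
print(soup.head.title.text)  # Example
print(soup.p.text)           # Hello
print(soup.table)            # None
```

No variable named head or p is ever defined here. The attribute name arrives in __getattr__() as a plain string and is used as a search key, so no code has to be generated or imported at run time.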
CodePudding user response:
In general, it is not considered good practice to have variable variable names, that is, variables whose names are decided at run time. Some languages even make it impossible to do so. To achieve the same thing, you can use a dictionary, which can have variable key names and variable values.
my_dict = {'key_1': 'value 1'}
print(my_dict['key_1'])
# out: value 1
my_dict['some_key'] = 'another value'
# now your dictionary looks like this:
# {'key_1': 'value 1', 'some_key': 'another value'}
print(my_dict['some_key'])
# out: another value
# as for dynamic names:
some_name = 'key_3'
my_dict[some_name] = 'value 3'
print(my_dict)
# out: {'key_1': 'value 1', 'some_key': 'another value', 'key_3': 'value 3'}
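For the scenario in the question (collecting data per tag without knowing the tag names in advance), the dictionary approach might look like the sketch below. It uses the standard library's html.parser instead of bs4 to stay self-contained. The class name TagTextCollector and the sample HTML are assumptions made up for the illustration.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect text per tag name in a dict instead of in named variables."""

    def __init__(self):
        super().__init__()
        self.text_by_tag = {}   # tag name -> list of text snippets
        self._open_tags = []    # stack of currently open tags

    def handle_starttag(self, tag, attrs):
        self._open_tags.append(tag)

    def handle_endtag(self, tag):
        if self._open_tags and self._open_tags[-1] == tag:
            self._open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._open_tags:
            # The tag name is only known at run time, so it becomes a key.
            self.text_by_tag.setdefault(self._open_tags[-1], []).append(text)


collector = TagTextCollector()
collector.feed("<body><h1>Example</h1><p>Hello</p><p>World</p></body>")
print(collector.text_by_tag)
# out: {'h1': ['Example'], 'p': ['Hello', 'World']}
```

The script never needs a variable called p or h1. Each tag name it encounters is just another dictionary key.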