When I try to parse an XML element (tag) named "Name" with BeautifulSoup4:
from bs4 import BeautifulSoup
exemplary_xml = '''
<SomeTag>
<UsualTag>abc</UsualTag>
<Name>xyz</Name>
</SomeTag>
'''
soup = BeautifulSoup(exemplary_xml, parser="xml")
print(soup.sometag.usualtag.string)
print(soup.sometag.name.string)
I get an error because it conflicts with the .name attribute, which the API
uses for accessing the tag's own name:
abc
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [16], in <module>
8 soup = BeautifulSoup(exemplary_xml, parser="lxml")
9 print(soup.sometag.usualtag.string)
---> 10 print(soup.sometag.name.string)
AttributeError: 'str' object has no attribute 'string'
How can I get the string value of an XML element/tag named "name"?
CodePudding user response:
The way you're selecting the xml parser in bs4 (parser="xml") is odd and deprecated. Use the features argument instead, and then look the element up with either find() or find_all().
For example:
from bs4 import BeautifulSoup
exemplary_xml = '''
<SomeTag>
<UsualTag>abc</UsualTag>
<Name>xyz</Name>
</SomeTag>
'''
soup = BeautifulSoup(exemplary_xml, features="xml")
print(soup.find("UsualTag").string)
print(soup.find("Name").string)
Output:
abc
xyz
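find_all() is mentioned above but not shown, so here is a minimal sketch of it. It assumes lxml is installed (features="xml" needs it) and uses a made-up document with two Name elements; the names xml_doc, names and lowercase_hit are only for illustration. It also shows that the xml parser keeps tag names case-sensitive, which is why the lookup uses "Name" rather than "name".

```python
from bs4 import BeautifulSoup

# Hypothetical document with a repeated <Name> element.
xml_doc = """
<SomeTag>
<Name>xyz</Name>
<Name>uvw</Name>
</SomeTag>
"""

soup = BeautifulSoup(xml_doc, features="xml")

# find_all() returns every matching element, in document order.
names = [tag.get_text() for tag in soup.find_all("Name")]
print(names)

# With the xml parser the lookup is case-sensitive, so "name" finds nothing.
lowercase_hit = soup.find("name")
print(lowercase_hit)
```

Passing the tag name as a string to find() or find_all() never touches the .name attribute, so the conflict from the question does not come up.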
CodePudding user response:
Right now I use a workaround: soup.sometag.find("name").string. Performance-wise this may not be optimal; there is probably a better way of identifying the XML element.
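To make the conflict and the workaround concrete, here is a small sketch. It assumes the built-in html.parser, which lowercases tag names and so matches the lowercase access used above; the variable names parent, tag_name, via_find and same are only for illustration. As far as I understand the bs4 documentation, dotted access such as soup.sometag is itself shorthand for find(), so the find("name") workaround should not cost more than the dotted form would.

```python
from bs4 import BeautifulSoup

exemplary_xml = """
<SomeTag>
<UsualTag>abc</UsualTag>
<Name>xyz</Name>
</SomeTag>
"""

# html.parser lowercases tag names, so <Name> is stored as "name".
soup = BeautifulSoup(exemplary_xml, "html.parser")
parent = soup.sometag

# .name is the tag's own name (a plain str), not the child element.
tag_name = parent.name
print(type(tag_name).__name__, tag_name)

# find("name") looks up the child element by name instead.
via_find = parent.find("name")
print(via_find.string)

# Dotted access and find() return the same element object.
same = parent.usualtag is parent.find("usualtag")
print(same)
```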