i use beautifulsoup to parsing xml file so the parsing done by tag name but can i put another word for searching inside the tag?
Data = soup.find_all('Data')
for Data in Data:
Data = Data.get_text()
Data is name of the tag but can i select a word inside this tag to parsing it maybe like this
Data = soup.find_all("Data", name = '"ObjectClass')
for Data in Data: Data = Data.get_text() print (Data)
i tried this but get this error TypeError: Tag.find_all() got multiple values for argument 'name'
This is an XML example:
<Document>
<Data Name="ObjectClass">computer</Data>
<Data Name="AttributeLDAPDisplayName">ms-Mcs-AdmPwdExpirationTime</Data>
<Data Name="ObjectClass">computer</Data>
<Data Name="AttributeLDAPDisplayName">ms-Mcs-AdmPwdExpirationTime</Data>
</Document>
So I want to search on only name =object class
CodePudding user response:
This will get only Data
tags with Name="ObjectClass"
. It requires pip install bs4 lxml
for external libraries:
from bs4 import BeautifulSoup
xml = '''\
<Document>
<Data Name="ObjectClass">computer</Data>
<Data Name="AttributeLDAPDisplayName">ms-Mcs-AdmPwdExpirationTime</Data>
<Data Name="ObjectClass">computer</Data>
<Data Name="AttributeLDAPDisplayName">ms-Mcs-AdmPwdExpirationTime</Data>
<Other Name="ObjectClass">other</Other>
</Document>
'''
soup = BeautifulSoup(xml,'xml')
for data in soup.find_all('Data',Name='ObjectClass'):
print(data.get_text())
Output:
computer
computer
Note that case matters (Name
not name
).
CodePudding user response:
One liner without any external library :-)
import xml.etree.ElementTree as ET
xml = '''\
<Document>
<Data Name="ObjectClass">computer1</Data>
<Data Name="AttributeLDAPDisplayName">ms-Mcs-AdmPwdExpirationTime</Data>
<Data Name="ObjectClass">compute2r</Data>
<Data Name="AttributeLDAPDisplayName">ms-Mcs-AdmPwdExpirationTime</Data>
<Other Name="ObjectClass">other</Other>
</Document>
'''
root = ET.fromstring(xml)
object_class_data = [x.text for x in root.findall('.//Data[@Name="ObjectClass"]')]
print(object_class_data)
output
['computer1', 'compute2r']