Let's assume we have an XML file:
<School Name = "school1">
    <Class Name = "class A">
        <Student Name = "student"/>
        <Student/>
        <!-- -->
    </Class>
</School>
I have a Python script that parses it, and I want to print the line number of a tag. For example, I want to print the line numbers of the tags that have no "Name" attribute. Is that possible? I saw an example that uses inheritance with ElementTree, but I couldn't understand it.
import xml.etree.ElementTree as ET

def read_root(root):
    for x in root:
        # Placeholder: Element has no lineNum attribute; this is the value I want
        print(x.lineNum)
        read_root(x)

def main():
    fn = "a.xml"
    try:
        tree = ET.parse(fn)
    except ET.ParseError as e:
        print("\nParse error:", str(e))
        print("while reading: " + fn)
        exit(1)
    root = tree.getroot()
    read_root(root)

if __name__ == "__main__":
    main()
CodePudding user response:
Your question is a bit unclear. Anyway, if you just want to find the tags that have no Name attribute and print their line numbers, you can use etree from lxml, as shown below:
from lxml import etree

doc = etree.parse('test.xml')
for element in doc.iter():
    # Report every node that has no "Name" attribute
    if "Name" not in element.attrib:
        print(f"Line {element.sourceline}: {element.tag}")
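Output (the comment is listed too, because doc.iter() also yields comment nodes, and their tag is the Comment factory function rather than a string):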
Line 4: Student
Line 5: <cyfunction Comment at 0x13b8e6dc0>
CodePudding user response:
You need a parser like ET.XMLPullParser, which can report "comment" and "pi" (processing instruction) events as well as "start-ns", "end-ns", "start" and "end" events.
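As a rough illustration of those events, here is a minimal sketch. The inline XML string and the helper name kind are made up for this example, and the "comment" and "pi" events need Python 3.8 or newer:

```python
import xml.etree.ElementTree as ET

# Ask for element, comment and processing-instruction events.
parser = ET.XMLPullParser(["start", "end", "comment", "pi"])
parser.feed("<root><?target data?><!-- note --><child/></root>")
parser.close()  # end of input; events already queued stay readable

def kind(elem):
    # Hypothetical helper: comments and PIs carry a factory function
    # in .tag instead of a string, so fall back to the function name.
    return elem.tag if isinstance(elem.tag, str) else elem.tag.__name__

events = [(event, kind(elem)) for event, elem in parser.read_events()]
for event, name in events:
    print(event, name)
```

The same isinstance(elem.tag, str) check is also a simple way to skip comments when you only care about real tags.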
If your XML file 'comment.xml' looks like:
<?xml version="1.0" encoding="UTF-8"?>
<School Name = "school1">
    <Class Name = "class A">
        <Student Name = "student"/>
        <Student/>
        <!-- Comment xml -->
    </Class>
</School>
You can then parse it to find the tags without the "Name" attribute, as well as the comments:
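import xml.etree.ElementTree as ET

# All available events:
# parser = ET.XMLPullParser(['start', 'end', "comment", "pi", "start-ns", "end-ns"])
parser = ET.XMLPullParser(['start', 'end', 'comment'])

with open('comment.xml', 'r', encoding='utf-8') as xml:
    feedstring = xml.readlines()

# Feed the file one line at a time so each event can be tied to a line.
# lineno is a 0-based index; add 1 to get the line number shown in an editor.
for lineno, line in enumerate(feedstring):
    parser.feed(line)
    for event, elem in parser.read_events():
        # Comments have no attributes, so they are reported as well
        if not elem.get("Name"):
            print(f"{lineno} Event:{event} | {elem.tag}, {elem.text}")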
Output:
4 Event:start | Student, None
4 Event:end | Student, None
5 Event:comment | <function Comment at 0x00000216C4FDA200>, Comment xml
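If you only need the line numbers and not the element tree, the lower-level xml.parsers.expat module from the standard library exposes the parser position directly. This is just a minimal sketch, assuming the whole document fits in memory; the function name lines_missing_attribute and the SAMPLE string are made up for this example:

```python
import xml.parsers.expat

def lines_missing_attribute(xml_text, attr_name):
    """Return (line, tag) pairs for elements that lack attr_name."""
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(tag, attrs):
        # During the callback, CurrentLineNumber is the 1-based line
        # on which the start tag begins.
        if attr_name not in attrs:
            found.append((parser.CurrentLineNumber, tag))

    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)  # True: this is the final chunk
    return found

# The XML from the question (hypothetical stand-in for reading a.xml).
SAMPLE = """<School Name = "school1">
<Class Name = "class A">
<Student Name = "student"/>
<Student/>
<!-- -->
</Class>
</School>
"""

for line, tag in lines_missing_attribute(SAMPLE, "Name"):
    print(f"Line {line}: {tag}")  # prints "Line 4: Student"
```

Only a start-element handler is registered here, so comments are never reported and there is nothing to filter out afterwards.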