Excess child elements when parsing XML in Java

Time:12-16

I have the XML-file:

<?xml version="1.0" encoding="UTF-8"?>
<questions>
    <question>
        <name>First question</name>
        <true>2</true>
        <answers>
            <answer>First answer</answer>
            <answer>Second answer</answer>
            <answer>Third answer</answer>
            <answer>Fourth answer</answer>
        </answers>
    </question>
    <question>
        <name>Second question</name>
        <true>3</true>
        <answers>
            <answer>First answer</answer>
            <answer>Second answer</answer>
            <answer>Third answer</answer>
            <answer>Fourth answer</answer>
        </answers>
    </question>
</questions>

Why does the Java code below return 9 elements instead of 4? The 5 unwanted elements contain only the line feed and tabs found between <answers> and <answer> (one), between </answer> and <answer> (three), and between </answer> and </answers> (one) in the XML:

File file = new File(path);
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document doc = documentBuilder.parse(file);
NodeList answers = doc.getElementsByTagName("answers").item(n).getChildNodes();

I then use this check to filter out the unwanted elements:

if (answers.item(i).getTextContent().trim().length() > 0)

I would be grateful if you could tell me a better way.

CodePudding user response:

It's not returning 9 elements - it's returning 9 nodes, which is correct. (After all, you're asking for the child nodes of the answers element.) Those whitespace-only text nodes are valid nodes. If you want only elements, ignore any node for which Node.getNodeType() doesn't return Node.ELEMENT_NODE.
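
As a rough sketch of the node-type check described above: the class name ChildElements and the helpers childElements and parse are made up for illustration, and the XML is parsed from an inline string rather than a file so the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildElements {
    // Hypothetical helper: keeps only the direct children that are elements,
    // skipping text nodes (including whitespace-only ones) and comments.
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    // Hypothetical helper: parses XML from a string instead of a file.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<answers>\n\t<answer>First answer</answer>\n\t<answer>Second answer</answer>\n</answers>";
        Node answers = parse(xml).getElementsByTagName("answers").item(0);

        // All child nodes: 3 whitespace text nodes plus 2 elements.
        System.out.println(answers.getChildNodes().getLength());

        // Only the element children.
        for (Element e : childElements(answers)) {
            System.out.println(e.getTextContent());
        }
    }
}
```

Unlike the trim().length() check, this keeps an <answer> element even if its text happens to be empty, because it tests the node type rather than the content.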

Alternatively, just call getElementsByTagName("answer") on the answers element to get just the elements. That's assuming you're happy to ignore any non-answer elements though. For example:

Element answersElement = (Element) doc.getElementsByTagName("answers").item(n);
NodeList answerElements = answersElement.getElementsByTagName("answer");
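
For a fuller picture, here is a minimal sketch that wraps those two lines in a runnable program. The class name AnswerReader and the helpers answersFor and parse are assumptions made for this example, and the XML is a shortened copy of the question's file, parsed from a string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AnswerReader {
    // Hypothetical helper: collects the text of every <answer> under the n-th <answers> element.
    static List<String> answersFor(Document doc, int n) {
        Element answersElement = (Element) doc.getElementsByTagName("answers").item(n);
        // Element.getElementsByTagName returns all matching descendants, not only direct children.
        NodeList answerElements = answersElement.getElementsByTagName("answer");
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < answerElements.getLength(); i++) {
            texts.add(answerElements.item(i).getTextContent());
        }
        return texts;
    }

    // Hypothetical helper: parses XML from a string instead of a file.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml =
                "<questions>\n"
              + "  <question>\n"
              + "    <name>First question</name>\n"
              + "    <answers>\n"
              + "      <answer>First answer</answer>\n"
              + "      <answer>Second answer</answer>\n"
              + "    </answers>\n"
              + "  </question>\n"
              + "</questions>";
        System.out.println(answersFor(parse(xml), 0)); // prints [First answer, Second answer]
    }
}
```

Because the lookup matches descendants at any depth, this fits the flat <answers>/<answer> layout from the question; with nested <answer> elements, filtering direct children by node type would be the safer choice.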