Empty lines appear after the element is removed from the xml file

Time:10-15

I added some elements to an XML file with an add method. After I delete an element and save, the file has a blank line where the element used to be. Why does this happen, and how can I get rid of it? The test method is:

@Test
public void test() {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    OperateXml operateXml = new OperateXml();
    try {
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new ClassPathResource("chunkInfo/chunk.xml").getFile());
        // operateXml.add(document);
        operateXml.deleteNodeById(document, "id_5");
        operateXml.saveDocument(document, new ClassPathResource("chunkInfo/chunk.xml").getFile());
    } catch (Exception e) {
        System.out.println(e);
    }
}

The save function is:

public static void saveDocument(Document document, File xmlFile) {
    TransformerFactory tff = TransformerFactory.newInstance();
    try {
        Transformer tf = tff.newTransformer();
        tf.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "chunk.dtd");
        tf.setOutputProperty(OutputKeys.INDENT, "yes");
        tf.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        tf.transform(new DOMSource(document), new StreamResult(xmlFile));
    } catch (TransformerException e) {
        e.printStackTrace();
    }
}

Before deleting the element, the xml file is:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE save SYSTEM "chunk.dtd">
<save>
    <chunk id="id_1">
    <name>test_6</name>
    <type>log</type>
    <item>2,3,4,5</item>
  </chunk>
    <chunk id="id_3">
    <name>test_6</name>
    <type>log</type>
    <item>2,3,4,5</item>
  </chunk>
    <chunk id="id_4">
    <name>test_6</name>
    <type>log</type>
    <item>2,3,4,5</item>
  </chunk>
<chunk id="id_5">
    <name>test_7</name>
    <type>log</type>
    <item>2,3,4,5</item>
  </chunk>
</save>

After deleting the element (id="id_5"), the xml file is:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE save SYSTEM "chunk.dtd">
<save>
    <chunk id="id_1">
    <name>test_6</name>
    <type>log</type>
    <item>2,3,4,5</item>
  </chunk>
    <chunk id="id_3">
    <name>test_6</name>
    <type>log</type>
    <item>2,3,4,5</item>
  </chunk>
    <chunk id="id_4">
    <name>test_6</name>
    <type>log</type>
    <item>2,3,4,5</item>
  </chunk>

</save>

Answer:

The likely cause is the invisible whitespace [text] nodes that sit between your elements (in the example below, '\n' is a newline and '\t' is a tab indent):

<save>[text '\n
\t']<chunk id="id_1">[text '\n
\t\t']<name>test_6</name>[text '\n
\t\t']<type>log</type>[text '\n
\t\t']<item>2,3,4,5</item>[text '\n
\t']</chunk>[text '\n
\t']<chunk id="id_3">[text '\n
\t\t']<name>test_6</name>[text '\n
\t\t']<type>log</type>[text '\n
\t\t']<item>2,3,4,5</item>[text '\n
\t']</chunk>[text '\n
\t']<chunk id="id_4">[text '\n
\t\t']<name>test_6</name>[text '\n
\t\t']<type>log</type>[text '\n
\t\t']<item>2,3,4,5</item>[text '\n
\t']</chunk>[text '\n
\t']<chunk id="id_5">[text '\n
\t\t']<name>test_7</name>[text '\n
\t\t']<type>log</type>[text '\n
\t\t']<item>2,3,4,5</item>[text '\n
\t']</chunk>[text '\n']
</save>
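
To see these text nodes for yourself, you can list the children of the root element. The following is a minimal sketch; the class name ChildNodeDump and the method describeChildren are hypothetical names chosen for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class ChildNodeDump {
    /**
     * Returns one description per direct child of the given node.
     * Text nodes are shown with newlines and tabs escaped so that
     * whitespace-only nodes become visible.
     */
    static List<String> describeChildren(Node parent) {
        List<String> descriptions = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                String escaped = child.getNodeValue()
                        .replace("\n", "\\n")
                        .replace("\t", "\\t");
                descriptions.add("[text '" + escaped + "']");
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                descriptions.add("<" + child.getNodeName() + ">");
            } else {
                // Comments, processing instructions, and so on.
                descriptions.add(child.getNodeName());
            }
        }
        return descriptions;
    }
}
```

Calling ChildNodeDump.describeChildren(document.getDocumentElement()) before and after the delete should show the same whitespace text nodes, with only the element entry missing afterwards.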

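When you remove <chunk id="id_5">, the [text] nodes on either side of it stay in the document. They are now adjacent, so the output contains two newlines in a row, which shows up as a blank line: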

<save>[text '\n
\t']<chunk id="id_1">[text '\n
\t\t']<name>test_6</name>[text '\n
\t\t']<type>log</type>[text '\n
\t\t']<item>2,3,4,5</item>[text '\n
\t']</chunk>[text '\n
\t']<chunk id="id_3">[text '\n
\t\t']<name>test_6</name>[text '\n
\t\t']<type>log</type>[text '\n
\t\t']<item>2,3,4,5</item>[text '\n
\t']</chunk>[text '\n
\t']<chunk id="id_4">[text '\n
\t\t']<name>test_6</name>[text '\n
\t\t']<type>log</type>[text '\n
\t\t']<item>2,3,4,5</item>[text '\n
\t']</chunk>[text '\n
\t'][..where chunk used to be..][text '\n']
</save>

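One option is to reformat the document: strip the whitespace-only text nodes before saving, and let the transformer (which already has INDENT set to "yes") re-indent the output.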
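
A minimal sketch of the reformatting approach, assuming no element in the file relies on whitespace-only text as real content. The class name XmlWhitespace and the method stripBlankTextNodes are hypothetical; the XPath and DOM calls are from the standard JDK.

```java
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class XmlWhitespace {
    /** Removes every text node in the document that contains only whitespace. */
    static void stripBlankTextNodes(Document document) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Select all text nodes whose content is empty after whitespace normalization.
        NodeList blanks = (NodeList) xpath.evaluate(
                "//text()[normalize-space(.) = '']", document, XPathConstants.NODESET);
        for (int i = 0; i < blanks.getLength(); i++) {
            Node blank = blanks.item(i);
            blank.getParentNode().removeChild(blank);
        }
    }
}
```

In the test method above, the call would go between deleteNodeById and saveDocument, so that the indentation in the saved file comes only from the transformer.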

Another is to get a reference to the element you want to delete, check its preceding sibling node, and, if that sibling is a whitespace-only text node, remove it as well.
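
The sibling check could look roughly like this. Since deleteNodeById is not shown in the question, this is a hypothetical helper (NodeRemover.removeWithLeadingWhitespace) that you would call on the element once you have found it.

```java
import org.w3c.dom.Node;

final class NodeRemover {
    /**
     * Removes the given element from its parent. If the node directly
     * before it is a whitespace-only text node, that node is removed too,
     * so no blank line is left behind.
     */
    static void removeWithLeadingWhitespace(Node element) {
        Node parent = element.getParentNode();
        if (parent == null) {
            return; // already detached, nothing to do
        }
        Node previous = element.getPreviousSibling();
        if (previous != null
                && previous.getNodeType() == Node.TEXT_NODE
                && previous.getNodeValue().trim().isEmpty()) {
            parent.removeChild(previous);
        }
        parent.removeChild(element);
    }
}
```

This keeps the rest of the file's existing indentation untouched, which may be preferable if you do not want the whole document reformatted on every save.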
