Compare two xml files ignoring certain elements using xpath in Java

Time:02-17

How can I compare two XML files in Java while ignoring certain elements that are identified by an XPath expression?

For example, I need to compare the two XML files below but ignore the 'Date' element, by passing that element's XPath (//Set[1]/Product[1]/Date) at run time. The element to ignore can vary from run to run.

XML 1:

<?xml version="1.0" encoding="utf-8"?>
<Set
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="urn:abc:product:v3" xsi:schemaLocation="urn:abc:product:v3 abc.xsd">
    <Product>
        <id>1</id>
        <ref>1</ref>
        <Date>2021-09-19</Date>
        <company>JJ</company>
        <lastModified>2021-09-20T21:00:30</lastModified>
        <productOne>
            <partProduct>
                <Level>3.0</Level>
                <Flag>0</Flag>
                <Code>EN</Code>
            </partProduct>
        </productOne>
    </Product>
</Set>

XML 2:

<?xml version="1.0" encoding="utf-8"?>
<Set
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="urn:abc:product:v3" xsi:schemaLocation="urn:abc:product:v3 abc.xsd">
    <Product>
        <id>2</id>
        <ref>2</ref>
        <Date>2021-09-20</Date>
        <company>JJ</company>
        <lastModified>2021-09-20T21:00:30</lastModified>
        <productOne>
            <partProduct>
                <Level>3.0</Level>
                <Flag>0</Flag>
                <Code>EN</Code>
            </partProduct>
        </productOne>
    </Product>
</Set>

Answer:

You need to transform both files into a form where they compare equal, by removing the elements you want to ignore. You would typically do this using XSLT. After the transformation you could either compare the results using the XPath 2.0 function deep-equal(), or serialise both documents as canonical XML and compare the files at the character or binary level.
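As a rough illustration of the transform-then-compare idea, here is a minimal Java sketch that uses only the XSLT 1.0 processor bundled with the JDK. Several things in it are assumptions rather than part of the original suggestion: the class and method names are made up, the prefix "p" is bound to urn:abc:product:v3 purely so the stylesheet can match 'Date', and the comparison is a plain string comparison of the two serialized results. That is only an approximation of canonical XML (it does not normalise attribute order or prefix choices, for instance), and the JDK processor has no deep-equal(), so treat it as a starting point.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltIgnoreCompare {

    // Identity transform plus one empty template that drops every Date element
    // in the urn:abc:product:v3 namespace. The "p" prefix is an assumption made
    // for this sketch; the source documents use a default namespace instead.
    static final String STRIP_DATE_XSLT =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + "    xmlns:p='urn:abc:product:v3'>"
        + "  <xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "  <xsl:strip-space elements='*'/>"
        + "  <xsl:template match='@*|node()'>"
        + "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "  </xsl:template>"
        + "  <xsl:template match='p:Date'/>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet over one document and returns the serialized result.
    static String stripAndSerialize(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STRIP_DATE_XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    // Yes/no answer: are the documents the same once Date has been removed?
    // Both sides go through the same serializer, so quoting and indentation
    // differences in the inputs do not matter, but this is not true C14N.
    static boolean equalIgnoringDate(String xmlA, String xmlB) throws Exception {
        return stripAndSerialize(xmlA).equals(stripAndSerialize(xmlB));
    }

    public static void main(String[] args) throws Exception {
        String a = "<Set xmlns='urn:abc:product:v3'><Product><id>1</id>"
                 + "<Date>2021-09-19</Date></Product></Set>";
        String b = "<Set xmlns='urn:abc:product:v3'><Product><id>1</id>"
                 + "<Date>2021-09-20</Date></Product></Set>";
        System.out.println(equalIgnoringDate(a, b));
    }
}
```

Note that the two sample files in the question also differ in 'id' and 'ref', so they would still compare as different when only 'Date' is ignored.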

UPDATE

Thanks for explaining the question more clearly.

I would do this by running XQuery Update to delete the nodes selected by the path expression, and then comparing the resulting documents either using fn:deep-equal(), or by doing canonical serialization and comparing the resulting lexical forms.

As an alternative to XQuery Update you could use xmlstarlet or Saxon's Gizmo tool.
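If you want to stay inside plain Java, the same delete-then-compare idea can be sketched with the JDK's own DOM and XPath 1.0 APIs. To be clear, this is not XQuery Update, xmlstarlet or Gizmo; it is a substitute that removes the nodes selected by each path passed in at run time and then compares the two trees with Node.isEqualNode() instead of fn:deep-equal(). The class and method names are hypothetical. One assumption matters a lot: the documents are parsed with namespace awareness switched off, so that an unprefixed path such as //Set[1]/Product[1]/Date matches elements that sit in a default namespace. With a namespace-aware parser you would instead need a NamespaceContext and prefixed paths.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathIgnoreCompare {

    // Assumption: namespace-unaware parsing, so unprefixed XPath steps match
    // elements by their plain tag name even under a default namespace.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(false);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Removes every node (element, text or attribute) selected by the expression.
    static void deleteMatches(Document doc, String xpathExpr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(xpathExpr, doc, XPathConstants.NODESET);
        for (int i = 0; i < hits.getLength(); i++) {
            Node n = hits.item(i);
            if (n instanceof Attr) {
                Attr attr = (Attr) n;
                attr.getOwnerElement().removeAttributeNode(attr);
            } else if (n.getParentNode() != null) {
                n.getParentNode().removeChild(n);
            }
        }
    }

    // Yes/no comparison after deleting whatever the given paths select.
    static boolean equalIgnoring(String xmlA, String xmlB, String... ignoredPaths)
            throws Exception {
        Document a = parse(xmlA);
        Document b = parse(xmlB);
        for (String path : ignoredPaths) {
            deleteMatches(a, path);
            deleteMatches(b, path);
        }
        // Merge text nodes left adjacent by the removals before comparing.
        a.normalize();
        b.normalize();
        return a.isEqualNode(b);
    }

    public static void main(String[] args) throws Exception {
        String a = "<Set xmlns=\"urn:abc:product:v3\"><Product><id>1</id>"
                 + "<Date>2021-09-19</Date></Product></Set>";
        String b = "<Set xmlns=\"urn:abc:product:v3\"><Product><id>1</id>"
                 + "<Date>2021-09-20</Date></Product></Set>";
        System.out.println(equalIgnoring(a, b, "//Set[1]/Product[1]/Date"));
    }
}
```

Because the paths are ordinary strings, the set of elements to ignore can change on every run without touching the code. isEqualNode() is sensitive to whitespace-only text nodes, so two files that are indented differently would still be reported as different unless you strip that whitespace first.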

But it might depend on what you want from the comparison. The above is fine if you want a yes/no answer, but getting details of the differences is more difficult. You could write your own query to find the differences, or use a tool such as DeltaXML.
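To give an idea of what "writing your own" difference finder involves, here is a deliberately naive Java sketch (hypothetical class and method names, JDK DOM only). It assumes both documents have their elements in the same order, walks them in parallel, and records a path for each leaf element whose text differs. Attributes, comments and mixed content are ignored, and it makes no attempt to realign after an insertion or deletion, which is exactly the hard part that a dedicated tool handles.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NaiveXmlDiff {

    static Element root(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    static List<Element> childElements(Element parent) {
        List<Element> out = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) out.add((Element) n);
        }
        return out;
    }

    // 1-based position of children.get(index) among siblings with the same name.
    static int positionAmongSameName(List<Element> children, int index) {
        int pos = 0;
        String name = children.get(index).getNodeName();
        for (int i = 0; i <= index; i++) {
            if (children.get(i).getNodeName().equals(name)) pos++;
        }
        return pos;
    }

    static void diff(Element a, Element b, String path, List<String> report) {
        if (!a.getNodeName().equals(b.getNodeName())) {
            report.add(path + ": <" + a.getNodeName() + "> vs <" + b.getNodeName() + ">");
            return;
        }
        List<Element> ca = childElements(a);
        List<Element> cb = childElements(b);
        if (ca.isEmpty() && cb.isEmpty()) {
            // Leaf elements: compare trimmed text content.
            String ta = a.getTextContent().trim();
            String tb = b.getTextContent().trim();
            if (!ta.equals(tb)) report.add(path + ": '" + ta + "' vs '" + tb + "'");
            return;
        }
        if (ca.size() != cb.size()) {
            report.add(path + ": " + ca.size() + " vs " + cb.size() + " child elements");
            return;
        }
        for (int i = 0; i < ca.size(); i++) {
            String step = "/" + ca.get(i).getNodeName()
                        + "[" + positionAmongSameName(ca, i) + "]";
            diff(ca.get(i), cb.get(i), path + step, report);
        }
    }

    // Returns one line per difference found; an empty list means none were seen.
    static List<String> diff(String xmlA, String xmlB) throws Exception {
        Element a = root(xmlA);
        Element b = root(xmlB);
        List<String> report = new ArrayList<>();
        diff(a, b, "/" + a.getNodeName(), report);
        return report;
    }

    public static void main(String[] args) throws Exception {
        String a = "<Set><Product><id>1</id><Date>2021-09-19</Date></Product></Set>";
        String b = "<Set><Product><id>2</id><Date>2021-09-20</Date></Product></Set>";
        diff(a, b).forEach(System.out::println);
    }
}
```

The reported paths could be filtered against the list of paths to ignore, which would combine the yes/no check with a readable list of what actually changed.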
