How to parse XML header properties in Java?

Time:05-12

I am trying to parse an XML document.

I have successfully parsed the id element, but I also want to parse the attributes in the header (the root element):

<?xml version="1.0" encoding="UTF-8"?>
<XMLExample
    xmlns="http://www.example.com"
appID="YYYYY" txnID="XXXX" userID="system" ver="v2.6" lk="LLLL"
ts="Wed Apr 09 11:33:40 IST 2014" scope="XYXYXY" filter="LALALLA"
idType="id">
    <id>000000</id>
</XMLExample>

I am getting the value of "id" here.

But I am trying to get the "appID", "txnID", and "userID" attributes from the XML header.

Thanks in advance

CodePudding user response:

If you want to get all or some of the attributes from the XMLExample node (id is an element) within an XML String using the DOM parser then you could do it this way:

Example Demo Only:

String xmlString = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<XMLExample\n"
        + "    xmlns=\"http://www.example.com\"\n"
        + "appID=\"YYYYY\" txnID=\"XXXX\" userID=\"system\" ver=\"v2.6\" lk=\"LLLL\"\n"
        + "ts=\"Wed Apr 09 11:33:40 IST 2014\" scope=\"XYXYXY\" filter=\"LALALLA\"\n"
        + "idType=\"id\">\n"
        + "    <id>000000</id>\n"
        + "</XMLExample>";

String xPathExpression = "//XMLExample";

org.w3c.dom.Document documento = null;
javax.xml.xpath.XPath xpath = null;
org.w3c.dom.NodeList nodos = null;

try {
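    // Load the XML document. The factory is not namespace-aware by default,
    // which is why the unprefixed XPath "//XMLExample" matches despite the xmlns declaration.
    javax.xml.parsers.DocumentBuilderFactory factory =
                      javax.xml.parsers.DocumentBuilderFactory.newInstance();
    javax.xml.parsers.DocumentBuilder builder = factory.newDocumentBuilder();
    java.io.InputStream inputStream = new java.io.ByteArrayInputStream(
            xmlString.getBytes(java.nio.charset.StandardCharsets.UTF_8));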
    documento = builder.parse(inputStream);
        
    // If from file instead of XML String...
    //documento = builder.parse(new File("./src/hmmmm_Testing/Some_XML_File.xml"));
}
catch (Exception e) {
    System.out.println(e.getMessage());
}

try {
    // Prepare XPath
    xpath = javax.xml.xpath.XPathFactory.newInstance().newXPath();

    // Queries
    nodos = (org.w3c.dom.NodeList) xpath.evaluate(xPathExpression, documento, javax.xml.xpath.XPathConstants.NODESET);
}
catch (Exception e) {
    System.out.println(e.getMessage());
}

for (int i = 0; i < nodos.getLength(); i++) {
    System.out.println("************* Using DOM **************");
    // You can of course apply the following to variables if you like.
    System.out.println("xmlnx:  ->  " + nodos.item(i).getAttributes().getNamedItem("xmlns").getNodeValue());
    System.out.println("appID:  ->  " + nodos.item(i).getAttributes().getNamedItem("appID").getNodeValue());
    System.out.println("txnID:  ->  " + nodos.item(i).getAttributes().getNamedItem("txnID").getNodeValue());
    System.out.println("userID: ->  " + nodos.item(i).getAttributes().getNamedItem("userID").getNodeValue());
    System.out.println("ver:    ->  " + nodos.item(i).getAttributes().getNamedItem("ver").getNodeValue());
    System.out.println("lk:     ->  " + nodos.item(i).getAttributes().getNamedItem("lk").getNodeValue());
    System.out.println("ts:     ->  " + nodos.item(i).getAttributes().getNamedItem("ts").getNodeValue());
    System.out.println("scope:  ->  " + nodos.item(i).getAttributes().getNamedItem("scope").getNodeValue());
    System.out.println("filter: ->  " + nodos.item(i).getAttributes().getNamedItem("filter").getNodeValue());
    System.out.println("idType: ->  " + nodos.item(i).getAttributes().getNamedItem("idType").getNodeValue());
    System.out.println("**************************************");
}

This will print the following to the Console Window:

************* Using DOM **************
xmlnx:  ->  http://www.example.com
appID:  ->  YYYYY
txnID:  ->  XXXX
userID: ->  system
ver:    ->  v2.6
lk:     ->  LLLL
ts:     ->  Wed Apr 09 11:33:40 IST 2014
scope:  ->  XYXYXY
filter: ->  LALALLA
idType: ->  id
**************************************
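
If only the root element's attributes are needed, XPath is arguably more than the job requires. The following is a minimal sketch, assuming the same sample XML; the class name RootAttributes, the constant SAMPLE and the method read() are made up for illustration. It takes the root element with getDocumentElement() and walks its NamedNodeMap, so the attribute names do not have to be hard-coded:

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
public class RootAttributes {

    // Sample document from the question.
    static final String SAMPLE = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<XMLExample\n"
            + "    xmlns=\"http://www.example.com\"\n"
            + "appID=\"YYYYY\" txnID=\"XXXX\" userID=\"system\" ver=\"v2.6\" lk=\"LLLL\"\n"
            + "ts=\"Wed Apr 09 11:33:40 IST 2014\" scope=\"XYXYXY\" filter=\"LALALLA\"\n"
            + "idType=\"id\">\n"
            + "    <id>000000</id>\n"
            + "</XMLExample>";

    // Parses the XML and returns every attribute of the root element, keyed by name.
    static Map<String, String> read(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();

            NamedNodeMap attrs = root.getAttributes();
            // NamedNodeMap does not promise any order, so sort by attribute name.
            Map<String, String> result = new TreeMap<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                result.put(attr.getNodeName(), attr.getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Print each attribute as "name -> value".
        read(SAMPLE).forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```

Because the attributes come back as a Map, a single value can be looked up with read(xml).get("appID"), which returns null for a missing attribute instead of throwing a NullPointerException the way getNamedItem(...).getNodeValue() does.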

Alternatively, for small XML strings like this one, you could skip the parser and use the getBetween() method (listed further down), for example:

String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<XMLExample\n"
        + "    xmlns=\"http://www.example.com\"\n"
        + "appID=\"YYYYY\" txnID=\"XXXX\" userID=\"system\" ver=\"v2.6\" lk=\"LLLL\"\n"
        + "ts=\"Wed Apr 09 11:33:40 IST 2014\" scope=\"XYXYXY\" filter=\"LALALLA\"\n"
        + "idType=\"id\">\n"
        + "    <id>000000</id>\n"
        + "</XMLExample>";

// Collapse each run of whitespace (including line breaks) into a single space.
xml = xml.replaceAll("\\s+", " ");

// Get all values from all attributes and the id element...
System.out.println("************* Using getBetween() **************");
// You can of course apply the following to variables if you like.
System.out.println("xmlns:  ->  " + getBetween(xml, "xmlns=\"", "\"")[0]);
System.out.println("appID:  ->  " + getBetween(xml, "appID=\"", "\"")[0]);
System.out.println("txnID:  ->  " + getBetween(xml, "txnID=\"", "\"")[0]);
System.out.println("userID: ->  " + getBetween(xml, "userID=\"", "\"")[0]);
System.out.println("ver:    ->  " + getBetween(xml, "ver=\"", "\"")[0]);
System.out.println("lk:     ->  " + getBetween(xml, "lk=\"", "\"")[0]);
System.out.println("ts:     ->  " + getBetween(xml, "ts=\"", "\"")[0]);
System.out.println("scope:  ->  " + getBetween(xml, "scope=\"", "\"")[0]);
System.out.println("filter: ->  " + getBetween(xml, "filter=\"", "\"")[0]);
System.out.println("idType: ->  " + getBetween(xml, "idType=\"", "\"")[0]);
System.out.println("The ID: ->  " + getBetween(xml, "<id>", "</id>")[0]);
System.out.println("**************************************");
System.out.println();

This will print the following to the Console Window:

************* Using getBetween() **************
xmlns:  ->  http://www.example.com
appID:  ->  YYYYY
txnID:  ->  XXXX
userID: ->  system
ver:    ->  v2.6
lk:     ->  LLLL
ts:     ->  Wed Apr 09 11:33:40 IST 2014
scope:  ->  XYXYXY
filter: ->  LALALLA
idType: ->  id
The ID: ->  000000
**************************************
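
One caveat with plain substring matching: none of the lookups above collide, but a left string such as ID=" would also match inside appID=", txnID=" and userID=". The following is a rough sketch of one way to guard against that, assuming double-quoted attribute values; the class AttrLookup and the method attr() are hypothetical names, not part of the original answer. It is still text matching rather than real XML parsing, so it can be fooled by attribute-like text inside comments or element content:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class AttrLookup {

    // Returns the value of the named attribute, or null if no such attribute is found.
    // Only double-quoted values are handled.
    static String attr(String xml, String name) {
        // The negative lookbehind rejects a match when the name is merely
        // the tail of a longer attribute name (for example "ID" in "appID").
        Pattern p = Pattern.compile(
                "(?<![\\w:.-])" + Pattern.quote(name) + "\\s*=\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(xml);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String xml = "<XMLExample appID=\"YYYYY\" txnID=\"XXXX\" userID=\"system\" idType=\"id\">"
                + "<id>000000</id></XMLExample>";
        System.out.println(attr(xml, "appID")); // YYYYY
        System.out.println(attr(xml, "ID"));    // null
    }
}
```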

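The getBetween() method (it needs the imports listed first):

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;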

/**
 * Retrieves any string data located between the supplied string leftString
 * parameter and the supplied string rightString parameter.<br><br>
 * <p>
 * This method will return all instances of a substring located between the
 * supplied Left String and the supplied Right String which may be found
 * within the supplied Input String.<br>
 *
 * @param inputString (String) The string to look for substring(s) in.<br>
 *
 * @param leftString  (String) What may be to the Left side of the substring
 *                    we want within the main input string. Sometimes the
 *                    substring you want may be contained at the very
 *                    beginning of a string and therefore there is no
 *                    Left-String available. In this case you would simply
 *                    pass an empty string ("") to this parameter which
 *                    basically informs the method of this fact. Null can
 *                    not be supplied and will ultimately generate a
 *                    NullPointerException.<br>
 *
 * @param rightString (String) What may be to the Right side of the
 *                    substring we want within the main input string.
 *                    Sometimes the substring you want may be contained at
 *                    the very end of a string and therefore there is no
 *                    Right-String available. In this case you would simply
 *                    pass an empty string ("") to this parameter which
 *                    basically informs the method of this fact. Null can
 *                    not be supplied and will ultimately generate a
 *                    NullPointerException.<br>
 *
 * @param options     (Optional - Boolean - 2 Parameters):<pre>
 *
 *      ignoreLetterCase    - Default is false. This option works against the
 *                            string supplied within the leftString parameter
 *                            and the string supplied within the rightString
 *                            parameter. If set to true then letter case is
 *                            ignored when searching for strings supplied in
 *                            these two parameters. If left at default false
 *                            then letter case is not ignored.
 *
 *      trimFound           - Default is true. By default this method will trim
 *                            off leading and trailing white-spaces from found
 *                            sub-string items. General sentences which obviously
 *                            contain spaces will almost always give you a white-
 *                            space within an extracted sub-string. By setting
 *                            this parameter to false, leading and trailing white-
 *                            spaces are not trimmed off before they are placed
 *                            into the returned Array.</pre>
 *
 * @return (1D String Array) Returns a Single Dimensional String Array
 *         containing all the sub-strings found within the supplied Input
 *         String which are between the supplied Left String and supplied
 *         Right String. You can shorten this method up a little by
 *         returning a List&lt;String&gt; ArrayList and removing the 'List
 *         to 1D Array' conversion code at the end of this method. This
 *         method initially stores its findings within a List Interface
 *         anyways.
 */
public static String[] getBetween(String inputString, String leftString, String rightString, boolean... options) {
    // Return null if nothing was supplied.
    if (inputString.isEmpty() || (leftString.isEmpty() && rightString.isEmpty())) {
        return null;
    }

    // Prepare optional parameters if any supplied.
    // If none supplied then use Defaults...
    boolean ignoreCase = false;      // Default.
    boolean trimFound = true;        // Default.
    if (options.length > 0) {
        if (options.length >= 1) {
            ignoreCase = options[0];
            if (options.length >= 2) {
                trimFound = options[1];
            }
        }
    }

    // Remove any control characters from the
    // supplied string (if they exist).
    String modString = inputString.replaceAll("\\p{Cntrl}", "");

    // Establish a List String Array Object to hold
    // our found substrings between the supplied Left
    // String and supplied Right String.
    List<String> list = new ArrayList<>();

    // Use Pattern Matching to locate our possible
    // substrings within the supplied Input String.
    String regEx = Pattern.quote(leftString) + "{1,}"
            + (!rightString.isEmpty() ? "(.*?)" : "(.*)?")
            + Pattern.quote(rightString);
    if (ignoreCase) {
        regEx = "(?i)" + regEx;
    }

    Pattern pattern = Pattern.compile(regEx);
    Matcher matcher = pattern.matcher(modString);
    while (matcher.find()) {
        // Add the found substrings into the List.
        String found = matcher.group(1);
        if (trimFound) {
            found = found.trim();
        }
        list.add(found);
    }
    return list.toArray(new String[list.size()]);
}

Of course, the getBetween() method can also be used to extract items from strings other than XML.
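
As an illustration of that, here is a stripped-down sketch of the same idea applied to a made-up log line. The between() method is a simplified, hypothetical stand-in for getBetween(): it has no options, always trims, returns a List, and assumes both delimiters are non-empty:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
public class BetweenDemo {

    // Collects every substring found between 'left' and 'right'.
    // Both delimiters are quoted so they are matched literally.
    static List<String> between(String input, String left, String right) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(Pattern.quote(left) + "(.*?)" + Pattern.quote(right))
                .matcher(input);
        while (m.find()) {
            found.add(m.group(1).trim());
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up input, used only to show a non-XML case.
        String line = "2014-04-09 [INFO] user=[system] action=[login]";
        System.out.println(between(line, "[", "]")); // [INFO, system, login]
    }
}
```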
