Java partial parse of XML file and attributes

Time:10-02

I need to parse an XML file whose full structure I don't know, but I do know a few nodes and attributes that I want to extract. Let's say I have this file structure:

<Root>
//lots and lots of irrelevant info here
               <Name>
                    <EffectiveName Value="My-Name" />
                    <UserName Value="" />
                    <Annotation Value="" />
                    <MemorizedFirstClipName Value="" />
                </Name>
//lots and lots of irrelevant info here
               <KeyTracks>
                                                <KeyTrack Id="43">
                                                    <Notes>
                                                        <MidiNoteEvent Time="0" Duration="0.5" Velocity="120" VelocityDeviation="0" OffVelocity="64" Probability="1" IsEnabled="true" NoteId="73" />
                                                    </Notes>
                                                    <MidiKey Value="0" />
                                                </KeyTrack>
                                                <KeyTrack Id="44">
                                                    <Notes>
                                                        <MidiNoteEvent Time="0.5" Duration="0.5" Velocity="120" VelocityDeviation="0" OffVelocity="64" Probability="1" IsEnabled="true" NoteId="75" />
                                                    </Notes>
                                                    <MidiKey Value="1" />
                                                </KeyTrack>
                                                <KeyTrack Id="45">
                                                    <Notes>
                                                        <MidiNoteEvent Time="1" Duration="0.5" Velocity="120" VelocityDeviation="0" OffVelocity="64" Probability="1" IsEnabled="true" NoteId="77" />
                                                    </Notes>
                                                    <MidiKey Value="2" />
                                                </KeyTrack>
                                                <KeyTrack Id="46">
                                                    <Notes>
                                                        <MidiNoteEvent Time="1.5" Duration="0.5" Velocity="120" VelocityDeviation="0" OffVelocity="64" Probability="1" IsEnabled="true" NoteId="79" />
                                                    </Notes>
                                                    <MidiKey Value="3" />
                                                </KeyTrack>
                                            </KeyTracks>
//lots and lots of irrelevant info here
</Root>

The real file is far bigger than that; it is trimmed here for the sake of the example.

I want to parse the file into the following objects:

CustomObject.class

public class CustomObject{
   private String effectiveName;
   private List<KeyTrack> keyTracks;
}

KeyTrack.class

public class KeyTrack {
    private Integer keyTrackId;
    private MidiNoteEvent midiNoteEvent;
    private Integer midiKey;
}

MidiNoteEvent.class

public class MidiNoteEvent {
    private Double time;
    private Integer velocity;
    private Double duration;
    private Boolean isEnabled;
}

I am trying to make this as generic as I can, so that I won't have to change my parser when I need to add another node or attribute. A switch/case or if/else chain is not right for my case: there can be hundreds of nodes and additions, and I don't want to change multiple classes whenever I need to parse extra info.

I tried to create an enum for the nodes I need, but I can't find the sweet spot to be able to grab the nodes correctly.

This is my parser's code for now:

    private final Map<String, Object> map;

    @Override
    public ProjectTrack parse(Node node) {
        parseToMap(node);
        return mapper.convertValue(map, CustomObject.class);
    }

    private void parseToMap(Node node) {
        // Only collect attributes from nodes listed in the RelevantMidiTrackNodes enum
        if (Arrays.stream(RelevantMidiTrackNodes.values()).anyMatch(
                e -> e.getNodeName().equals(node.getNodeName()))
        ) {
            System.out.print(node.getNodeName() + ": ");
            for (int i = 0; i < node.getAttributes().getLength(); i++) {
                Attr attribute = (Attr) node.getAttributes().item(i);
                System.out.print(attribute.getNodeName() + " - " + attribute.getValue() + ", ");
                map.put(attribute.getNodeName(), attribute.getValue());
            }
            System.out.println();
        }
        NodeList nodeList = node.getChildNodes();
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node currentNode = nodeList.item(i);
            if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
                // recurse into every child that is an Element
                parseToMap(currentNode);
            }
        }
    }

CodePudding user response:

Here is an example of how to select nodes with XPath and unmarshal them to POJOs with JAXB. In this case, it gets the first MidiNoteEvent:

import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import jakarta.xml.bind.JAXBContext;
import jakarta.xml.bind.Unmarshaller;

public class MainJaxbXpath {

    public static void main(String[] args) throws Exception {
            FileInputStream fileIS;
            fileIS = new FileInputStream(System.getProperty("user.home") + "/tmp/tmp.xml");

            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder;
            builder = builderFactory.newDocumentBuilder();

            JAXBContext jc = JAXBContext.newInstance( MidiNoteEvent.class );
            Unmarshaller u = jc.createUnmarshaller();

            Document xmlDocument;
            xmlDocument = builder.parse(fileIS);

            // Select every MidiNoteEvent element anywhere in the document
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodeList = (NodeList) xPath.compile("//MidiNoteEvent").evaluate(xmlDocument, XPathConstants.NODESET);

            // Unmarshal the first matching node into the annotated POJO
            MidiNoteEvent o = (MidiNoteEvent) u.unmarshal( nodeList.item(0) );
    }
}

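MidiNoteEvent.class with proper annotations:

import jakarta.xml.bind.annotation.XmlAccessType;
import jakarta.xml.bind.annotation.XmlAccessorType;
import jakarta.xml.bind.annotation.XmlAttribute;
import jakarta.xml.bind.annotation.XmlRootElement;

@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name="MidiNoteEvent")
public class MidiNoteEvent {
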
    @XmlAttribute(name = "Time")
    private Double time;
    @XmlAttribute(name = "Velocity")
    private Integer velocity;
    @XmlAttribute(name = "Duration")
    private Double duration;
    @XmlAttribute(name = "IsEnabled")
    private Boolean isEnabled;
    // getters/setters
}

A JAXBContext can be created for more than one class:

JAXBContext jc = JAXBContext.newInstance( KeyTrack.class, MidiNoteEvent.class );
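
For comparison, the XPath step above can also be sketched with only the JDK's built-in DOM and XPath APIs, reading attributes by hand instead of unmarshalling with JAXB. This is a minimal sketch, not the answer's method: the class name KeyTrackXPathSketch, the flat Track holder, and the trimmed SAMPLE document are all hypothetical, and it assumes each KeyTrack contains exactly one MidiNoteEvent, as in the question's sample.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class KeyTrackXPathSketch {

    // Trimmed-down stand-in for the file in the question (two tracks only).
    static final String SAMPLE =
        "<Root><Name><EffectiveName Value=\"My-Name\"/></Name><KeyTracks>"
      + "<KeyTrack Id=\"43\"><Notes><MidiNoteEvent Time=\"0\" Duration=\"0.5\""
      + " Velocity=\"120\" IsEnabled=\"true\"/></Notes><MidiKey Value=\"0\"/></KeyTrack>"
      + "<KeyTrack Id=\"44\"><Notes><MidiNoteEvent Time=\"0.5\" Duration=\"0.5\""
      + " Velocity=\"120\" IsEnabled=\"true\"/></Notes><MidiKey Value=\"1\"/></KeyTrack>"
      + "</KeyTracks></Root>";

    // Hypothetical flat holder combining the KeyTrack and MidiNoteEvent fields.
    static final class Track {
        final int id;
        final int midiKey;
        final double time;
        final double duration;
        final int velocity;
        final boolean enabled;

        Track(int id, int midiKey, double time, double duration, int velocity, boolean enabled) {
            this.id = id;
            this.midiKey = midiKey;
            this.time = time;
            this.duration = duration;
            this.velocity = velocity;
            this.enabled = enabled;
        }
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    static String effectiveName(Document doc) throws Exception {
        // The String-returning evaluate() yields the attribute value directly.
        return XPathFactory.newInstance().newXPath().evaluate("//EffectiveName/@Value", doc);
    }

    static List<Track> extractTracks(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("//KeyTrack", doc, XPathConstants.NODESET);
        List<Track> tracks = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element keyTrack = (Element) nodes.item(i);
            // Relative paths are evaluated against the current KeyTrack element.
            // Assumption: exactly one MidiNoteEvent per track, so note is never null.
            Element note = (Element) xpath.evaluate("Notes/MidiNoteEvent", keyTrack, XPathConstants.NODE);
            tracks.add(new Track(
                    Integer.parseInt(keyTrack.getAttribute("Id")),
                    Integer.parseInt(xpath.evaluate("MidiKey/@Value", keyTrack)),
                    Double.parseDouble(note.getAttribute("Time")),
                    Double.parseDouble(note.getAttribute("Duration")),
                    Integer.parseInt(note.getAttribute("Velocity")),
                    Boolean.parseBoolean(note.getAttribute("IsEnabled"))));
        }
        return tracks;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(effectiveName(doc));
        for (Track t : extractTracks(doc)) {
            System.out.println(t.id + " key=" + t.midiKey + " time=" + t.time);
        }
    }
}
```

Because `//KeyTrack` and `//EffectiveName` match at any depth, the surrounding irrelevant structure never has to be described. The trade-off against JAXB is that each attribute is read by hand, so a new field means one more line in extractTracks.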

CodePudding user response:

My personal tool of choice for pretty well all XML processing is XSLT. When you mix XSLT and Java the details vary from one XSLT processor to another, but this is how to do it using XSLT 3.0 and Saxon. This example uses the open source version.

First, write a stylesheet that extracts the data you want as an XDM (XPath Data Model) map:

<xsl:transform version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <xsl:map>
     <xsl:apply-templates select="//EffectiveName, //KeyTrack"/>
  </xsl:map>
</xsl:template>
<xsl:template match="//EffectiveName">
  <xsl:map-item key="'EffectiveName'" select="@Value"/>
</xsl:template>
<xsl:template match="//KeyTrack">
  <xsl:map-item key="@Id" select="map{
      'time':      number(Notes/MidiNoteEvent/@Time),
      'velocity':  number(Notes/MidiNoteEvent/@Velocity),
      'duration':  number(Notes/MidiNoteEvent/@Duration),
      'isEnabled': string(Notes/MidiNoteEvent/@IsEnabled),
      'midiKey':   number(MidiKey/@Value)
     }"/>
</xsl:template>
</xsl:transform>

Run this transformation to deliver a Saxon XdmMap object:

// Needs java.io.File, javax.xml.transform.stream.StreamSource and net.sf.saxon.s9api.*
Processor proc = new Processor(false);
DocumentBuilder builder = proc.newDocumentBuilder();
XdmNode source = builder.build(new File("input.xml"));
XsltCompiler comp = proc.newXsltCompiler();
Xslt30Transformer trans = comp.compile(new StreamSource(new File("stylesheet.xsl"))).load30();
XdmMap result = (XdmMap) trans.applyTemplates(source);

Then process the XdmMap to construct the Java POJOs you want to use in your application.

for (Map.Entry<XdmAtomicValue, XdmValue> entry : result.entrySet()) {
   String key = entry.getKey().getStringValue();
   if (key.equals("EffectiveName")) {
     ...
   } else {
      ...
   }
}
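
The two branches of that loop could be filled in along the following lines. To keep the sketch runnable without Saxon, it uses a plain java.util.Map as a stand-in for the XdmMap, so it only illustrates the shape of the logic: the class name MapToPojoSketch, the simplified POJOs, and the assumption that numbers arrive as Number and isEnabled as a String are all hypothetical, not part of the Saxon API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapToPojoSketch {

    // Simplified, hypothetical versions of the POJOs from the question.
    static final class KeyTrack {
        final int id;
        final double time;
        final int velocity;
        final double duration;
        final boolean enabled;
        final int midiKey;

        KeyTrack(int id, Map<String, Object> fields) {
            this.id = id;
            this.time = ((Number) fields.get("time")).doubleValue();
            this.velocity = ((Number) fields.get("velocity")).intValue();
            this.duration = ((Number) fields.get("duration")).doubleValue();
            this.enabled = Boolean.parseBoolean((String) fields.get("isEnabled"));
            this.midiKey = ((Number) fields.get("midiKey")).intValue();
        }
    }

    static final class CustomObject {
        String effectiveName;
        final List<KeyTrack> keyTracks = new ArrayList<>();
    }

    // Mirrors the loop above: one special key for the name,
    // every other key is a KeyTrack Id mapped to a nested map of fields.
    @SuppressWarnings("unchecked")
    static CustomObject build(Map<String, Object> extracted) {
        CustomObject result = new CustomObject();
        for (Map.Entry<String, Object> entry : extracted.entrySet()) {
            String key = entry.getKey();
            if (key.equals("EffectiveName")) {
                result.effectiveName = (String) entry.getValue();
            } else {
                Map<String, Object> fields = (Map<String, Object>) entry.getValue();
                result.keyTracks.add(new KeyTrack(Integer.parseInt(key), fields));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-built data shaped like the stylesheet's output for one track.
        Map<String, Object> track = new LinkedHashMap<>();
        track.put("time", 0.5);
        track.put("velocity", 120.0);
        track.put("duration", 0.5);
        track.put("isEnabled", "true");
        track.put("midiKey", 1.0);

        Map<String, Object> extracted = new LinkedHashMap<>();
        extracted.put("EffectiveName", "My-Name");
        extracted.put("44", track);

        CustomObject obj = build(extracted);
        System.out.println(obj.effectiveName + " has " + obj.keyTracks.size() + " track(s)");
    }
}
```

With a real XdmMap the casts would be replaced by the corresponding Saxon value accessors, but the branching stays the same: the stylesheet decides what is extracted, and the Java side only copies values into objects.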

If you prefer, with very little change you could make the stylesheet output the data you want in JSON, and then use a JSON library to build your POJOs.

Actually, if I were doing it, I would ask whether you actually need to process the data in Java at all - I would probably try and find a way to do the whole job in XSLT. But I don't know what the "whole job" is, so that may be unrealistic.
