Parse XML-like file that has one extra field without tags

Time:10-18

I have the following file content:

<movie>
<title>Kung Fu Killer</title>
<genre>Martial arts</genre>
</movie>
http://www.imdb.com/title/tt2952602/

I parse this file into a Java object with the following code:

  XmlMapper mapper = new XmlMapper();
  mapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
  try {
      NfoFileXmlModel xmlFileContent = mapper.readValue(nfoFile.toFile(), NfoFileXmlModel.class);
  } catch (JsonParseException e) {
      log.debug("Failed to deserialize {} ({})", nfoFile, e.toString());
  }

@ToString
@Getter
@Setter
public class NfoFileXmlModel {
    @ToString
    @Getter
    @Setter
    public static class Art {
        String poster;
    }

    Art art;
    String year;
    String title;
}

This works well for the XML-like 'fields'. How can I parse the last line, which is 'anonymous', so to speak, since there are no tags around the value?

CodePudding user response:

You can read the content of the file, wrap it in an extra root tag, and read the result as XML. XmlMapper lets you handle problems through the DeserializationProblemHandler class: it has a handleUnknownProperty method that you can override to handle your case. See the example below:

import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.JsonDeserializer;
import com.fasterxml.jackson.databind.deser.DeserializationProblemHandler;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import lombok.Getter;
import lombok.Setter;
import lombok.ToString;

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class XmlMapperApp {

    public static void main(String... args) throws Exception {
        File xmlFile = new File("./resource/test.xml").getAbsoluteFile();
        String content = "<NfoRoot>" + String.join("", Files.readAllLines(xmlFile.toPath())) + "</NfoRoot>";

        XmlMapper xmlMapper = new XmlMapper();
        xmlMapper.addHandler(new DeserializationProblemHandler() {
            @Override
            public boolean handleUnknownProperty(DeserializationContext ctxt, JsonParser p, JsonDeserializer<?> deserializer, Object beanOrClass, String propertyName) throws IOException {
                if (beanOrClass instanceof NfoRoot) {
                    ((NfoRoot) beanOrClass).setRestOfTheFile(p.readValueAs(String.class));
                    return true;
                }

                return super.handleUnknownProperty(ctxt, p, deserializer, beanOrClass, propertyName);
            }
        });

        NfoRoot value = xmlMapper.readValue(content, NfoRoot.class);
        System.out.println(value);
    }
}

@ToString
@Getter
@Setter
class NfoFileXmlModel {
    @ToString
    @Getter
    @Setter
    public static class Art {
        String poster;
    }

    Art art;
    String year;
    String title;
    String genre;
}

@ToString
@Getter
@Setter
class NfoRoot {
    NfoFileXmlModel movie;

    String restOfTheFile;
}

The above code prints:

NfoRoot(movie=NfoFileXmlModel(art=null, year=null, title=Kung Fu Killer, genre=Martial arts), restOfTheFile=http://www.imdb.com/title/tt2952602/)

CodePudding user response:

Turn it into XML by adding a start and end tag:

<doc>
<movie>
<title>Kung Fu Killer</title>
<genre>Martial arts</genre>
</movie>
http://www.imdb.com/title/tt2952602/
</doc>

and once it's been turned into XML, you can use all the XML technology you want.
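As one illustration of "XML technology", here is a minimal sketch that wraps the content and reads it with the DOM parser that ships with the JDK. The class name WrappedNfoDom and the method parse are made up for this example, and it assumes the file contains a <title> element and that the untagged text sits directly under the added <doc> element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
public class WrappedNfoDom {

    // Wraps the raw file content in <doc>...</doc>, parses it, and returns
    // { text of the first <title>, text found directly under <doc> }.
    public static String[] parse(String raw) {
        String wrapped = "<doc>" + raw + "</doc>";
        Element root;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            root = builder.parse(new InputSource(new StringReader(wrapped))).getDocumentElement();
        } catch (Exception e) {
            // Reached when the wrapped text is not well-formed XML.
            throw new IllegalArgumentException("Content is not well-formed XML after wrapping", e);
        }

        // Assumption: at least one <title> element exists.
        String title = root.getElementsByTagName("title").item(0).getTextContent();

        // Collect text nodes that are direct children of <doc>;
        // these hold the untagged line (plus surrounding whitespace).
        StringBuilder loose = new StringBuilder();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                loose.append(n.getNodeValue());
            }
        }
        return new String[] { title, loose.toString().trim() };
    }

    public static void main(String[] args) {
        String raw = "<movie>\n<title>Kung Fu Killer</title>\n<genre>Martial arts</genre>\n</movie>\n"
                + "http://www.imdb.com/title/tt2952602/\n";
        String[] result = parse(raw);
        System.out.println(result[0]); // Kung Fu Killer
        System.out.println(result[1]); // http://www.imdb.com/title/tt2952602/
    }
}
```
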

Of course, this relies on the idea that if you add <doc> at the start and </doc> at the end, you will get well-formed XML. There's no way to be sure of that; if the last line were

http://www.imdb.com/title/tt2952602/?x=3&y=4

then this approach would fail. It would be much better if you could persuade the originator of this data to use XML to start with - standards are useful!
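If the format cannot be changed, one possible workaround is to keep the untagged text away from the XML parser altogether. The sketch below assumes the loose text always comes after the last closing tag of the root element (here </movie>); the class NfoSplitter and its split method are hypothetical names for this example. The XML part can then be handed to whatever parser you already use, and the trailing part is kept as a plain string, so characters such as & do no harm.

```java
// Hypothetical helper, not part of any library.
public class NfoSplitter {

    // Holds the two halves of the file.
    public static final class Parts {
        public final String xml;       // everything up to and including the closing root tag
        public final String trailing;  // whatever follows, trimmed

        Parts(String xml, String trailing) {
            this.xml = xml;
            this.trailing = trailing;
        }
    }

    // Splits raw content at the last occurrence of </rootName>.
    public static Parts split(String raw, String rootName) {
        String closeTag = "</" + rootName + ">";
        int idx = raw.lastIndexOf(closeTag);
        if (idx < 0) {
            throw new IllegalArgumentException("Closing tag not found: " + closeTag);
        }
        int end = idx + closeTag.length();
        return new Parts(raw.substring(0, end), raw.substring(end).trim());
    }

    public static void main(String[] args) {
        String raw = "<movie>\n<title>Kung Fu Killer</title>\n</movie>\n"
                + "http://www.imdb.com/title/tt2952602/?x=3&y=4\n";
        Parts parts = split(raw, "movie");
        // parts.xml is well-formed on its own and can be parsed as usual.
        System.out.println(parts.xml);
        // parts.trailing is never parsed as XML, so the bare '&' is fine.
        System.out.println(parts.trailing);
    }
}
```

This trades generality for robustness: it would break if the trailing text itself contained the string </movie>.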
