Deserializing Invalid XML with System.XML

Time:03-02

I am attempting to deserialize some XML that I am not responsible for producing. It has a single monolithic root node with a branch for each of several modules. The problem is that the modules may contain sub-nodes that share the same name but have different child nodes and attributes. These same-named nodes are not namespaced. In the abstract, the target document looks something like this:

<Root>
    <Module1>
         <Node SomeAttribute="123" />
    </Module1>
    <Module2>
         <Node SomeOtherAttribute="Something" />
    </Module2>
</Root>

I have seen various suggestions to annotate my POCOs with a namespace to avoid the following exception, which is thrown when I try to construct an XmlSerializer for the Root type that has both Module1 and Module2 as members.

System.InvalidOperationException : Types 'Root.Module1.Item1' and 'Root.Module1.Item2' both use the XML type name, 'Item', from namespace ''. Use XML attributes to specify a unique XML name and/or namespace for the type.

I think that with System.Text.Json I wouldn't have this problem, as the type is decided by the POCO class structure, not by the name of the node being deserialized.

Is there a way to deserialize this object in its monolithic form, perhaps by annotating the Module1.Node and Module2.Node POCO classes with attributes (decorators)?

I couldn't find the relevant decorators when I tried. I did succeed in stopping the XmlSerializer constructor exception but it stopped recognising the Node types and was unable to deserialize either.

My next step will be to make separate XmlSerializer instances for each module and see if I can do away with the Root object, which felt inefficient anyway.

Here is an example of the setup in fiddle: https://dotnetfiddle.net/0twN0O
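For illustration, the per-module idea described above could be sketched roughly as follows. This is only a sketch under stated assumptions: the namespaces `PerModule.First` and `PerModule.Second` and the helper `ModuleReader.Read` are hypothetical names that do not appear in the fiddle, and it assumes that each XmlSerializer instance only inspects the types reachable from the type it was constructed for, so two classes that are both called Node should not collide.

```csharp
using System.Xml.Linq;
using System.Xml.Serialization;

namespace PerModule.First
{
    public class Node
    {
        [XmlAttribute]
        public int SomeAttribute { get; set; }
    }

    [XmlRoot("Module1")]
    public class Module1
    {
        public Node Node { get; set; }
    }
}

namespace PerModule.Second
{
    public class Node
    {
        [XmlAttribute]
        public string SomeOtherAttribute { get; set; }
    }

    [XmlRoot("Module2")]
    public class Module2
    {
        public Node Node { get; set; }
    }
}

namespace PerModule
{
    // Hypothetical helper: deserializes one child element of the document root
    // with a serializer built for that module type only.
    public static class ModuleReader
    {
        public static T Read<T>(string xml, string elementName) where T : class
        {
            XElement element = XDocument.Parse(xml).Root.Element(elementName);
            if (element == null)
            {
                return null; // the module is not present in this document
            }

            var serializer = new XmlSerializer(typeof(T));
            using (var reader = element.CreateReader())
            {
                return (T)serializer.Deserialize(reader);
            }
        }
    }
}
```

A call such as ModuleReader.Read<PerModule.First.Module1>(xml, "Module1") would then return only the first module, without a Root class at all. The trade-off is that the document is parsed into an XDocument first, which may matter for very large inputs.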

CodePudding user response:

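I have a solution for you, but it will only work if you fix your XML before using it (for example, 123 must be quoted as "123"). Both modules share a single Node class that declares every possible attribute; an attribute missing from the XML is simply left at its default value.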

using System.IO;
using System.Xml.Serialization;

public class Node
{
    [XmlAttribute]
    public string SomeOtherAttribute { get; set; }

    [XmlAttribute]
    public int SomeAttribute { get; set; }
}

public class Module
{
    public Node Node { get; set; }
}

[XmlRoot("Root")]
public class OrderedItem
{
    [XmlElement("Module1")]
    public Module Module1 { get; set; }
    [XmlElement("Module2")]
    public Module Module2 { get; set; }
}

public class Program
{
    public static void Main(string[] args)
    {
        string xml = @"<Root>
                        <Module1>
                             <Node SomeAttribute = ""123"" /> 
                         </Module1> 
                         <Module2>
                              <Node SomeOtherAttribute = ""Something"" /> 
                          </Module2 >
                      </Root>";

        XmlSerializer serializer = new XmlSerializer(typeof(OrderedItem));
        using (TextReader reader = new StringReader(xml))
        {
            var result = (OrderedItem)serializer.Deserialize(reader);
        }

    }
}

CodePudding user response:

This is a small extension of @d-a's answer that uses interfaces to keep the objects separate from the consumer's point of view.

The answer to the following was super useful: XML serialization of interface property

I don't like the public property of the concrete type, but I struggled to make it private and still deserialize with the ISerializable interface.

I might still try another serializer to see how it goes: https://github.com/ExtendedXmlSerializer/home

using System.IO;
using System.Xml;
using System.Xml.Serialization;

[XmlType]
public class Node : Module1.Node1, Module2.Node2
{
    [XmlAttribute]
    public string SomeOtherAttribute { get; set; }

    [XmlAttribute]
    public int SomeAttribute { get; set; }
}

public class Module1
{
    [XmlElement(ElementName="Node")]
    public Node _node { get; set; }
    [XmlIgnore]
    public Node1 Node { get { return (Node1)_node; } }
    
    public interface Node1 {
        public int SomeAttribute { get; set; }
    }

}

public class Module2
{
    [XmlElement(ElementName="Node")]
    public Node _node { get; set; }
    [XmlIgnore]
    public Node2 Node { get { return (Node2)_node; } }
    
    public interface Node2 {
        public string SomeOtherAttribute { get; set; }
    }
}

[XmlRoot("Root")]
public class OrderedItem
{
    [XmlElement("Module1")]
    public Module1 Module1 { get; set; }
    [XmlElement("Module2")]
    public Module2 Module2 { get; set; }
}

public class Program
{
    public static void Main(string[] args)
    {
        string xml = @"<Root>
                        <Module1>
                             <Node SomeAttribute = ""123"" /> 
                         </Module1> 
                         <Module2>
                              <Node SomeOtherAttribute = ""Something"" /> 
                          </Module2 >
                      </Root>";

        XmlSerializer serializer = new XmlSerializer(typeof(OrderedItem));
        using (TextReader reader = new StringReader(xml))
        {
            var result = (OrderedItem)serializer.Deserialize(reader);
            System.Console.Out.WriteLine(result.Module1.Node.SomeAttribute);
            System.Console.Out.WriteLine(result.Module2.Node.SomeOtherAttribute);
        }

    }
}

https://dotnetfiddle.net/3AYdVT

CodePudding user response:

As long as the POCO class names are unique, there is no need for the property names to be unique. The Node types must therefore have distinct names (Node1 and Node2 below), but the properties that hold them may both be called Node.

https://dotnetfiddle.net/0twN0O

using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Module1{
    public Node1 Node { get; set; }
    public class Node1 {
        [XmlAttribute]
        public int SomeAttribute { get; set; }
    }
}

public class Module2
{   
    public Node2 Node { get; set; }
    public class Node2 {
        [XmlAttribute]
        public string SomeOtherAttribute { get; set; }
    }
}

[XmlRoot("Root")]
public class OrderedItem
{
    [XmlElement("Module1")]
    public Module1 Module1 { get; set; }
    [XmlElement("Module2")]
    public Module2 Module2 { get; set; }
}

public class Program
{
    public static void Main(string[] args)
    {
        string xml = @"<Root>
                        <Module1>
                             <Node SomeAttribute = ""1232"" /> 
                         </Module1> 
                         <Module2>
                              <Node SomeOtherAttribute = ""Something"" /> 
                          </Module2 >
                      </Root>";

        XmlSerializer serializer = new XmlSerializer(typeof(OrderedItem));
        using (TextReader reader = new StringReader(xml))
        {   
            var result = (OrderedItem)serializer.Deserialize(reader);
            System.Console.Out.WriteLine(result.Module1.Node.SomeAttribute);
            System.Console.Out.WriteLine(result.Module2.Node.SomeOtherAttribute);
        }

    }
}
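
If renaming the classes is not an option, the exception message itself points at an alternative: keep two classes that are both called Node and give each a distinct XML type name with [XmlType]. The following is a minimal sketch of that idea, not a tested drop-in. The namespaces `Demo.First` and `Demo.Second`, the type names "Module1Node" and "Module2Node", and the helper `RootReader.Parse` are all made up for illustration. It relies on the element name being taken from the property name (Node) while the XML type name is only used to tell the two types apart.

```csharp
using System.IO;
using System.Xml.Serialization;

namespace Demo.First
{
    // Same CLR name as Demo.Second.Node, but a distinct XML type name.
    [XmlType(TypeName = "Module1Node")]
    public class Node
    {
        [XmlAttribute]
        public int SomeAttribute { get; set; }
    }
}

namespace Demo.Second
{
    [XmlType(TypeName = "Module2Node")]
    public class Node
    {
        [XmlAttribute]
        public string SomeOtherAttribute { get; set; }
    }
}

namespace Demo
{
    public class Module1
    {
        // The element name defaults to the property name, "Node".
        public First.Node Node { get; set; }
    }

    public class Module2
    {
        public Second.Node Node { get; set; }
    }

    [XmlRoot("Root")]
    public class Root
    {
        public Module1 Module1 { get; set; }
        public Module2 Module2 { get; set; }
    }

    // Hypothetical helper that deserializes the whole document in one pass.
    public static class RootReader
    {
        public static Root Parse(string xml)
        {
            var serializer = new XmlSerializer(typeof(Root));
            using (var reader = new StringReader(xml))
            {
                return (Root)serializer.Deserialize(reader);
            }
        }
    }
}
```

Calling RootReader.Parse(xml) with the XML from the question should then populate Module1.Node.SomeAttribute and Module2.Node.SomeOtherAttribute from a single serializer. Note that the two classes live in separate C# namespaces here because a class cannot contain both a property and a nested type named Node.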