I need to process XML input in which some tags contain HTML. For these tags I want the raw content so that I can process it later. I followed this answer and used XmlElement, which works fine in most cases. The only problem I'm facing is with self-closing tags.
[Serializable]
public class Root
{
    public XmlElement Description { get; set; }
    public string Name { get; set; }
}
var serializer = new XmlSerializer(typeof(Root));
var obj1 = serializer.Deserialize(new StringReader(@"<Root><Description><p>test</p></Description><Name>Test</Name></Root>"));
// Description: "Element, Name=\"p\""
// Name: "Test"
var obj2 = serializer.Deserialize(new StringReader(@"<Root><Description></Description><Name>Test</Name></Root>"));
// Description: null
// Name: "Test"
var obj3 = serializer.Deserialize(new StringReader(@"<Root><Description/><Name>Test</Name></Root>"));
// Description: "Element, Name=\"Name\""
// Name: null
obj1 and obj2 are OK (obj2.Description == "" would be better), but in obj3 the Description member is greedy and contains the Name element, leaving Name itself null.
Is there a workaround for this problem?
CodePudding user response:
A possible workaround is to declare a custom class for the Description property that matches any content inside the element using the [XmlAnyElement] attribute:
public class Root
{
    public Description Description { get; set; }
    public string Name { get; set; }
}
public class Description
{
    // Collects the child elements of <Description>; text nodes are not captured.
    [XmlAnyElement]
    public List<XmlElement> Content { get; set; }
}
The only drawback is that this won't work for mixed content. In other words, this deserializes well:
<Description><p>test</p></Description>
but this doesn't; only the <span> element is deserialized:
<Description>some <span>other</span> text</Description>
Should you need mixed content, implement IXmlSerializable on the Description class.
However, it does work for <Description/>. That being said, I do agree with @MarcGravell that this is a bug in XmlSerializer and should be reported.
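To make the behavior described above easier to try out, here is a minimal sketch that runs the [XmlAnyElement] workaround against the inputs from the question plus the mixed-content case. The AnyElementDemo class and its Parse and RawContent helpers are hypothetical names introduced only for this illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

public class Root
{
    public Description Description { get; set; }
    public string Name { get; set; }
}

public class Description
{
    // Collects the child elements of <Description>; text nodes are not captured.
    [XmlAnyElement]
    public List<XmlElement> Content { get; set; }
}

public static class AnyElementDemo
{
    // Hypothetical helper: deserializes a Root from an XML string.
    public static Root Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Root));
        using (var reader = new StringReader(xml))
        {
            return (Root)serializer.Deserialize(reader);
        }
    }

    // Hypothetical helper: joins the captured child elements back into markup.
    // Treats a missing Description or a null list as empty content.
    public static string RawContent(Root root)
    {
        var elements = root.Description?.Content ?? new List<XmlElement>();
        return string.Concat(elements.Select(e => e.OuterXml));
    }

    public static void Main()
    {
        var inputs = new[]
        {
            "<Root><Description><p>test</p></Description><Name>Test</Name></Root>",
            "<Root><Description></Description><Name>Test</Name></Root>",
            "<Root><Description/><Name>Test</Name></Root>",
            // Mixed content: the surrounding text is expected to be dropped.
            "<Root><Description>some <span>other</span> text</Description><Name>Test</Name></Root>",
        };

        foreach (var xml in inputs)
        {
            var root = Parse(xml);
            Console.WriteLine($"Description: {RawContent(root)} / Name: {root.Name}");
        }
    }
}
```

The point to check in the output is that Name is populated for the self-closing case as well, and that the last input loses its text nodes, which is the drawback mentioned above.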
CodePudding user response:
As suggested, I reported this issue.
I now use this code to work around the issue:
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Program
{
    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(Root));
        var obj1 = serializer.Deserialize(new StringReader(@"<Root><Description><p>test</p></Description><Name>Test</Name></Root>"));
        Console.WriteLine(obj1); // Description: <p>test</p> / Name: Test
        var obj2 = serializer.Deserialize(new StringReader(@"<Root><Description></Description><Name>Test</Name></Root>"));
        Console.WriteLine(obj2); // Description: / Name: Test
        var obj3 = serializer.Deserialize(new StringReader(@"<Root><Description/><Name>Test</Name></Root>"));
        Console.WriteLine(obj3); // Description: / Name: Test
    }

    public class Root
    {
        public HtmlElement Description { get; set; }
        public string Name { get; set; }
        public override string ToString() => $"Description: {Description.Content} / Name: {Name}";
    }

    public class HtmlElement : IXmlSerializable
    {
        public string Content { get; set; } = "";

        public XmlSchema GetSchema() => null;

        public void ReadXml(XmlReader reader)
        {
            reader.MoveToContent();
            if (reader.IsEmptyElement)
            {
                // Self-closing tag: ReadXml must still consume the element,
                // so advance the reader to the next sibling.
                reader.Read();
                return;
            }
            // ReadInnerXml is defined on XmlReader itself, so no cast is needed.
            // It returns the raw inner markup and leaves the reader after the end tag.
            Content = reader.ReadInnerXml();
        }

        public void WriteXml(XmlWriter writer) => throw new NotImplementedException();
    }
}
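The HtmlElement class above only supports reading, since WriteXml throws NotImplementedException. If serialization is also needed, one possible approach is to write the stored markup back with XmlWriter.WriteRaw. The following is only a sketch under that assumption: RawHtml, RoundTripDemo, Parse, and ToXml are hypothetical names, and WriteRaw does no escaping or validation, so Content must already be well-formed XML.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Root
{
    public RawHtml Description { get; set; }
    public string Name { get; set; }
}

// Hypothetical variant of HtmlElement that supports both directions.
public class RawHtml : IXmlSerializable
{
    public string Content { get; set; } = "";

    public XmlSchema GetSchema() => null;

    public void ReadXml(XmlReader reader)
    {
        reader.MoveToContent();
        if (reader.IsEmptyElement)
        {
            // Consume the self-closing element so the reader moves on.
            reader.Read();
            return;
        }
        // Raw inner markup; the reader ends up after the closing tag.
        Content = reader.ReadInnerXml();
    }

    // The serializer writes the wrapping <Description> element itself,
    // so only the element body is written here, unescaped.
    public void WriteXml(XmlWriter writer) => writer.WriteRaw(Content);
}

public static class RoundTripDemo
{
    // Hypothetical helper: deserializes a Root from an XML string.
    public static Root Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Root));
        using (var reader = new StringReader(xml))
        {
            return (Root)serializer.Deserialize(reader);
        }
    }

    // Hypothetical helper: serializes a Root back to an XML string.
    public static string ToXml(Root root)
    {
        var serializer = new XmlSerializer(typeof(Root));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, root);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var root = Parse("<Root><Description><p>test</p></Description><Name>Test</Name></Root>");
        var xml = ToXml(root);
        Console.WriteLine(xml);

        // Reading the serialized output again should preserve Name.
        var again = Parse(xml);
        Console.WriteLine($"Description: {again.Description.Content} / Name: {again.Name}");
    }
}
```

Writing the content raw keeps the HTML untouched, at the cost of trusting that the stored string is well-formed; if that cannot be guaranteed, parsing Content into an XmlDocument fragment before writing would be a safer design.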