Preserve &#xD; and &#xA; when reading XML

Time:11-26

The XML content looks like the following:

<xml>
  <item content="abcd &#xD; abcd &#xA; abcd" />
</xml>

When I use XmlDocument to read the value of the content attribute, &#xD; and &#xA; are automatically decoded into the actual carriage-return and line-feed characters.

Code:

XmlDocument doc = new XmlDocument();
doc.Load("data.xml"); // assumed file name; it holds the XML shown above
var content = doc.SelectSingleNode("/xml/item").Attributes["content"].Value;

How can I get the raw attribute text, with the character references left as written?

CodePudding user response:

If these characters were written to the lexical XML stream without escaping, then they would be swallowed by the XML parser when the stream is read by the recipient, as a result of the XML line-ending normalisation rules. So you've got it the wrong way around: the reason they are escaped is in order to preserve them; if they weren't escaped, they would be lost.
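To make that concrete, here is a minimal sketch that parses the attribute from the question and lists the control characters found in the resulting value. The class and method names (`DecodedValueDemo`, `ReadContent`, `ControlCodePoints`) are illustrative only and do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class DecodedValueDemo
{
    // Parses the XML string and returns the decoded value of /xml/item/@content.
    public static string ReadContent(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectSingleNode("/xml/item").Attributes["content"].Value;
    }

    // Lists the code points of all control characters in the given text.
    public static List<string> ControlCodePoints(string text)
    {
        var result = new List<string>();
        foreach (char c in text)
        {
            if (char.IsControl(c))
            {
                result.Add("U+" + ((int)c).ToString("X4"));
            }
        }
        return result;
    }

    public static void Main()
    {
        string value = ReadContent(
            "<xml><item content=\"abcd &#xD; abcd &#xA; abcd\" /></xml>");

        // The character references survive parsing as a real CR and a real LF.
        Console.WriteLine(string.Join(", ", ControlCodePoints(value)));
    }
}
```

For the sample input this prints `U+000D, U+000A`: both characters reach the application precisely because they were written as character references.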

CodePudding user response:

I found a workaround that works for me:

private static string GetAttributeValue(XmlNode node, string attributeName)
{
    if (node == null || string.IsNullOrWhiteSpace(attributeName))
    {
        throw new ArgumentException();
    }

    const string CharLF = "&#xA;";
    const string CharCR = "&#xD;";            

    string xmlContent = node.OuterXml;

    if (!xmlContent.Contains(CharLF) && !xmlContent.Contains(CharCR))
    {
        // no special char, return its original value directly
        return node.Attributes[attributeName].Value;
    }

    string value = string.Empty;
    if (xmlContent.Contains(attributeName))
    {
        value = xmlContent.Substring(xmlContent.IndexOf(attributeName)).Trim();
        value = value.Substring(value.IndexOf("\"") + 1);
        value = value.Substring(0, value.IndexOf("\""));
    }

    return value;
}
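An alternative that avoids searching through OuterXml is to take the decoded value and turn CR and LF back into character references. This is only a sketch under assumptions: `AttributeReEncoder` and `GetValueWithLineBreakRefs` are hypothetical names, and it assumes the `&#xD;` / `&#xA;` spelling used in the question is the one wanted.

```csharp
using System;
using System.Xml;

public static class AttributeReEncoder
{
    // Hypothetical helper: reads the decoded attribute value and re-encodes
    // CR and LF as the hexadecimal character references from the question.
    public static string GetValueWithLineBreakRefs(XmlNode node, string attributeName)
    {
        if (node == null)
        {
            throw new ArgumentNullException(nameof(node));
        }

        XmlAttribute attribute = node.Attributes?[attributeName];
        if (attribute == null)
        {
            throw new ArgumentException("Attribute not found.", nameof(attributeName));
        }

        return attribute.Value
            .Replace("\r", "&#xD;")
            .Replace("\n", "&#xA;");
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<xml><item content=\"abcd &#xD; abcd &#xA; abcd\" /></xml>");
        XmlNode item = doc.SelectSingleNode("/xml/item");

        Console.WriteLine(GetValueWithLineBreakRefs(item, "content"));
    }
}
```

For the sample document this prints `abcd &#xD; abcd &#xA; abcd`. It does not reproduce the source text byte for byte: a reference written as `&#13;` would come back as `&#xD;`, and other escapes such as `&amp;` stay decoded.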