How to remove a stray > character from an XML string in C#

Time:10-05

I need to remove some special characters and a stray ">" from an XML string. Loading the XML with LoadXml throws a "Data at the root level is invalid" error.

public T ConvertXmlFromByte<T>(byte[] data) where T : class
{
    T model = null;
    try
    {
        if (data != null)
        {
            XmlSerializer serializer = null;
            XmlDocument xmlDoc = new XmlDocument();
            string xml = Encoding.UTF8.GetString(data);
            //xml = Regex.Replace(xml, @"[^&;:()a-zA-Z0-9\=./><_-~-]", string.Empty);
            xmlDoc.LoadXml(xml); // the root-level data exception is thrown here
        }
    }
    catch (Exception ex)
    {
        _customLogger.Error(ex.Message, ex);
    }
    return model;
}

Below is my XML string:

<?xml version="1.0" encoding="utf-8" standalone="no"?><Test xmlns="https://cdn.Test.go.cr/xml-schemas/v4.3/test" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   </Test>>

CodePudding user response:

You are trying to load the text as XML and then remove what appears to be an extra character.

The problem is that this extra character makes the text invalid XML. An XML parser cannot load text that is not well-formed, which is why you get this exception.
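To illustrate the point, here is a minimal sketch of the failure. The class name `MalformedXmlDemo`, the helper `IsWellFormed`, and the shortened sample strings are hypothetical stand-ins, not part of the original code; only `XmlDocument.LoadXml` and `XmlException` come from the framework.

```csharp
using System;
using System.Xml;

public static class MalformedXmlDemo
{
    // Hypothetical helper: returns true when the text parses as a
    // well-formed XML document, false when the parser rejects it.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Shortened stand-ins for the document in the question.
        string broken = "<Test></Test>>"; // extra '>' after the root element
        string valid = "<Test></Test>";

        Console.WriteLine(IsWellFormed(broken)); // False
        Console.WriteLine(IsWellFormed(valid));  // True
    }
}
```

The trailing ">" counts as text outside the root element, so the parser fails before any DOM is built. That is why the cleanup cannot be done through the XML API.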

You must therefore treat it as a plain string: find the offending characters, fix the string, and only then load or save it.

A simple Regex can do this. You don't state the exact conditions, so I assume that any occurrence of '>>' is incorrect; amend the pattern as appropriate.

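using System.IO;
using System.Text.RegularExpressions;

// Read the file as plain text, since it is not yet valid XML.
string contents = File.ReadAllText(@"c:\path\file.xml");
// Collapse every ">>" into a single ">".
string output = Regex.Replace(contents, ">>", ">");
File.WriteAllText(@"c:\path\output.xml", output);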
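Since the question starts from a byte array rather than a file, the same idea can be applied in memory. The sketch below is one possible variant, not the answer's method: instead of replacing every ">>", it drops whatever follows the last closing root tag, which avoids touching a legitimate ">>" inside element text. The names `XmlCleanup`, `TrimAfterRoot`, and `LoadFromBytes` are hypothetical, and the code assumes the root element name is known and is not self-closing.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlCleanup
{
    // Hypothetical helper: drops everything after the last closing root tag.
    // Assumes the root is written as </rootName> (not self-closing).
    public static string TrimAfterRoot(string xml, string rootName)
    {
        string closingTag = "</" + rootName + ">";
        int index = xml.LastIndexOf(closingTag, StringComparison.Ordinal);
        if (index < 0)
        {
            return xml; // closing tag not found: leave the text unchanged
        }
        return xml.Substring(0, index + closingTag.Length);
    }

    // Decodes the bytes as UTF-8, cleans the string, then parses it.
    public static XmlDocument LoadFromBytes(byte[] data, string rootName)
    {
        string xml = Encoding.UTF8.GetString(data);
        var doc = new XmlDocument();
        doc.LoadXml(TrimAfterRoot(xml, rootName));
        return doc;
    }

    public static void Main()
    {
        // Shortened stand-in for the document in the question.
        byte[] data = Encoding.UTF8.GetBytes("<Test></Test>>");
        XmlDocument doc = LoadFromBytes(data, "Test");
        Console.WriteLine(doc.DocumentElement.Name); // Test
    }
}
```

For the XML in the question, the call would be `LoadFromBytes(data, "Test")`. If the stray characters can also appear elsewhere in the document, this trimming is not enough and the pattern has to be adapted to the actual corruption.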