C# Regex for prepending underscore to any tag elements starting with number

Time:03-08

What Regex.Replace pattern can I use to prepend an underscore to the name of any tag element that starts with a digit?

e.g.

"<1ABC>Hello</1ABC><A8D>World</A8D><0>!</0>"

would become

"<_1ABC>Hello</_1ABC><A8D>World</A8D><_0>!</_0>"

CodePudding user response:

This regex produces the expected result, though there may well be better ones.

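using System;
using System.Text.RegularExpressions;

string input = @"<1ABC>Hello</1ABC><A8D>World</A8D><0>!</0>";
// $1 = "<" or "</", $2 = a tag name starting with a digit, $3 = ">".
// \w already includes digits, so [\d\w] is written here as plain \w.
string output = Regex.Replace(input, @"(</?)(\d\w*?)(>)", @"$1_$2$3");
Console.WriteLine(output);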

CodePudding user response:

@Lei Yang's answer fails if an element has attributes. A minimal change fixes that:

using System.Text.RegularExpressions;

string input = @"<1ABC id='abc'>Hello</1ABC><A8D>World</A8D><0>!</0>";
string output = Regex.Replace(input, @"(</?)(\d.*?)([ >])", @"$1_$2$3");
Console.WriteLine(output);
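
One case the pattern above does not cover is a self-closing tag such as <0/>, where the lazy .*? runs up to the ">" and the "/" ends up inside the captured name. The following is a minimal sketch of a tighter variant, not a definitive fix: the class name TagFixer and method name PrependUnderscore are hypothetical, and like the other answers it treats the markup as plain text rather than parsing it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class TagFixer
{
    // (</?)         "<" or "</"
    // (\d[^\s/>]*)  a name that starts with a digit and runs up to whitespace, "/" or ">"
    // (?=[\s/>])    lookahead: the name must be followed by one of those (not consumed)
    private static readonly Regex LeadingDigitTag =
        new Regex(@"(</?)(\d[^\s/>]*)(?=[\s/>])");

    public static string PrependUnderscore(string input)
    {
        // Only the "<" / "</" and the name are matched, so attributes and "/>" stay untouched.
        return LeadingDigitTag.Replace(input, "$1_$2");
    }
}
```

With this sketch, TagFixer.PrependUnderscore("<1ABC id='abc'>Hello</1ABC><0/>") should return "<_1ABC id='abc'>Hello</_1ABC><_0/>". A "<" followed by a digit in ordinary text (for example "a <5 b") would still be rewritten, so this is only suitable when such text cannot occur.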

CodePudding user response:

Try this:

private static Regex rxTagsWithLeadingDigit = new Regex(@"
  (</?)    # open/close tag start, followed by
  (\d\w*)  # a tag name that begins with a decimal digit, followed by
  (\s|/?>) # a whitespace character or end-of-tag
", RegexOptions.IgnorePatternWhitespace);

public string ensureTagsStartWithWordCharacter( string s )
{
  return rxTagsWithLeadingDigit.Replace( s , "$1_$2$3" );
}
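
As a quick check, the commented-pattern approach can be run against the sample from the question. This is a self-contained sketch under assumptions: the class name LeadingDigitTagDemo and the methods Fix and Run are hypothetical, and the name group is written as \d\w* (zero or more word characters) so that one-character names such as <0> also match.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class LeadingDigitTagDemo
{
    // IgnorePatternWhitespace lets the pattern span lines and carry # comments.
    private static readonly Regex Rx = new Regex(@"
      (</?)     # open/close tag start
      (\d\w*)   # tag name beginning with a decimal digit
      (\s|/?>)  # whitespace or end-of-tag
    ", RegexOptions.IgnorePatternWhitespace);

    public static string Fix(string s)
    {
        // Re-emit the three captured groups with an underscore before the name.
        return Rx.Replace(s, "$1_$2$3");
    }

    public static void Run()
    {
        string input = "<1ABC>Hello</1ABC><A8D>World</A8D><0>!</0>";
        Console.WriteLine(Fix(input));
        // prints <_1ABC>Hello</_1ABC><A8D>World</A8D><_0>!</_0>
    }
}
```

Because the third group accepts whitespace as well as "/>" and ">", the same call also handles tags with attributes and self-closing tags.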