I need to replace some characters in a string with a \ plus the original character.
So given this string and array:
string origin = @"words&sales -test\strange";
string[] specialChars = new string[]{"\\", "&", "-", "?",......};
I want to get:
"words\&sales \-test\\strange"
Notice that the \ itself is a character to find and replace.
Thanks.
CodePudding user response:
Generally speaking, the fastest way to build String values in C#/.NET is with a StringBuilder, even if you're transforming another String value.
The other problem is the "best" way to determine which Char values should be escaped. If the set of escapable characters is fixed at compile-time, then use a switch statement: the compiler can typically turn it into a jump table or a handful of direct comparisons, which is generally faster than a runtime HashSet<Char> lookup for testing set membership. For example:
// Requires: using System; using System.Text;
static String Escape( String input )
{
    StringBuilder sb = new StringBuilder( capacity: 5 * input.Length / 4 ); // Assuming 25% length increase due to escaping.
    foreach( Char c in input )
    {
        switch( c )
        {
            case '\\':
            case '&':
            case '-':
            case '?':
                // Escapable character: emit a backslash, then the character itself.
                _ = sb.Append( '\\' ).Append( c );
                break;
            default:
                _ = sb.Append( c );
                break;
        }
    }
    return sb.ToString();
}
If the set of escapable characters is defined at runtime then using a HashSet<Char> will likely be the best overall option - though if you know you're only processing chars with Unicode code-points within a limited range (say ASCII-compatible chars in the range 0x00 to 0x7F) then you could use a Boolean[128] array to store the escape flag map.
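As a rough illustration of the flag-map idea, here is a minimal sketch that assumes the escapable characters are all ASCII. The class name AsciiEscaper and the helper BuildEscapeMap are hypothetical names made up for this example; the table has 128 slots so that every code point from 0x00 to 0x7F has an entry.

```csharp
using System;
using System.Text;

static class AsciiEscaper
{
    // Hypothetical helper: one flag per ASCII code point (0x00 to 0x7F inclusive, so 128 slots).
    public static Boolean[] BuildEscapeMap( String escapableChars )
    {
        Boolean[] map = new Boolean[128];
        foreach( Char c in escapableChars )
        {
            if( c > 0x7F ) throw new ArgumentException( "Only ASCII characters are supported.", nameof( escapableChars ) );
            map[c] = true;
        }
        return map;
    }

    public static String Escape( String input, Boolean[] escapeMap )
    {
        StringBuilder sb = new StringBuilder( capacity: 5 * input.Length / 4 ); // Same 25% estimate as above.
        foreach( Char c in input )
        {
            // Characters beyond the end of the table are never escaped.
            if( c < escapeMap.Length && escapeMap[c] )
            {
                _ = sb.Append( '\\' );
            }
            _ = sb.Append( c );
        }
        return sb.ToString();
    }
}
```

With the map built from @"\&-?", calling AsciiEscaper.Escape on the question's input words&sales -test\strange should give words\&sales \-test\\strange. Whether this actually beats a HashSet<Char> depends on the workload, so measure before committing to it.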
Using a HashSet<Char>, it would be like this:
// Requires: using System; using System.Collections.Generic; using System.Text;
static String Escape( String input, IEnumerable<Char> escapableChars )
{
    HashSet<Char> escapeThese = new HashSet<Char>( escapableChars );
    StringBuilder sb = new StringBuilder( capacity: 5 * input.Length / 4 ); // Assuming 25% length increase due to escaping.
    foreach( Char c in input )
    {
        if( escapeThese.Contains( c ) )
        {
            // Escapable character: emit a backslash, then the character itself.
            _ = sb.Append( '\\' ).Append( c );
        }
        else
        {
            _ = sb.Append( c );
        }
    }
    return sb.ToString();
}
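The question supplies the special characters as a string[], while the method above takes an IEnumerable<Char>. One possible way to bridge the two is sketched below, assuming every array entry is a single-character string. EscapeUsage and EscapeFromStrings are hypothetical names for this example, and the HashSet-based Escape is repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class EscapeUsage
{
    // Same HashSet-based approach as the answer's method, repeated for self-containment.
    public static String Escape( String input, IEnumerable<Char> escapableChars )
    {
        HashSet<Char> escapeThese = new HashSet<Char>( escapableChars );
        StringBuilder sb = new StringBuilder( capacity: 5 * input.Length / 4 );
        foreach( Char c in input )
        {
            if( escapeThese.Contains( c ) )
            {
                _ = sb.Append( '\\' );
            }
            _ = sb.Append( c );
        }
        return sb.ToString();
    }

    // Hypothetical adapter: Single() throws InvalidOperationException
    // if an entry is not exactly one character long.
    public static String EscapeFromStrings( String input, String[] specialChars )
    {
        return Escape( input, specialChars.Select( s => s.Single() ) );
    }
}
```

Using only the four characters the question actually lists ("\\", "&", "-", "?"), EscapeUsage.EscapeFromStrings( @"words&sales -test\strange", specialChars ) should return words\&sales \-test\\strange, which is the output the question asks for.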
Of course, the above code can be optimized further. Some suggestions:
- First check whether the String input has any escapable characters at all: if none of its characters are escapable then just return input directly, without creating a new StringBuilder.
- Create an (on-demand) pool of StringBuilder instances instead of creating new instances on every call.
- Allow ReadOnlySpan<Char> instead of String for input and write output to a Span<Char> - you'll need an initial step to calculate the required minimum size of the Span<Char> first though, and pass that info back to the caller.
  - The same minimum-size calculation can be done to have an exactly correct capacity: value for the StringBuilder instead of my (lazy) 25% estimate.
- Add memoization: use a Bloom filter and output cache keyed by the input value.
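The early-return check and the exact-capacity calculation from the list above can share a single counting pass. The following is a minimal sketch of that combination, not a tuned implementation; ExactEscaper is a hypothetical name, and it assumes the caller already holds the HashSet<Char> so that it is not rebuilt on every call.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class ExactEscaper
{
    public static String Escape( String input, HashSet<Char> escapeThese )
    {
        // Pass 1: count how many characters need a preceding backslash.
        Int32 escapeCount = 0;
        foreach( Char c in input )
        {
            if( escapeThese.Contains( c ) ) escapeCount++;
        }

        // Fast path: nothing to escape, so return the original instance without allocating.
        if( escapeCount == 0 ) return input;

        // Pass 2: each escaped character adds exactly one extra Char to the output.
        StringBuilder sb = new StringBuilder( capacity: input.Length + escapeCount );
        foreach( Char c in input )
        {
            if( escapeThese.Contains( c ) )
            {
                _ = sb.Append( '\\' );
            }
            _ = sb.Append( c );
        }
        return sb.ToString();
    }
}
```

The trade-off is that every character is tested twice when escaping is needed, in exchange for no allocation at all when it is not. Which side wins depends on how often inputs actually contain escapable characters.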