I've run into a problem that I can't quite figure out. I am evaluating many large text strings that contain long runs of spaces between words. I have worked out that, for the text to display properly, I need to replace about half of the spaces in each run with non-breaking space characters. The exact split depends on whether the number of spaces is even or odd. I have the replacement boiled down to:
if (numberOfSpaces > 3) {
    // Divide by 2.0 (not 2) so an odd count gives the extra character to the NBSPs.
    double mathresult = numberOfSpaces / 2.0;
    int numberNBSP = (int)Math.Ceiling(mathresult);
    int numberSpace = (int)Math.Floor(mathresult);
    string replaceText = "";
    for (; numberNBSP > 0; numberNBSP--)
        replaceText += "\u00A0"; // non-breaking space
    for (; numberSpace > 0; numberSpace--)
        replaceText += " ";
}
My issue now is calling this code for each run of spaces. Each run needs to be evaluated individually, and I feel like I have a blind spot about how to do that with RegEx. I hope this makes sense; thank you for taking the time to read this!
CodePudding user response:
It's just a matter of passing a callback (a MatchEvaluator) to Regex.Replace, which will execute for every match that is made.
For example:
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string input = "I hope     this makes sense,      thank you for taking the time to read this!";
        Console.WriteLine("input: " + input);
        // Match each run of one or more spaces; Evaluator is called once per run.
        Regex rx = new Regex(@" +");
        string output = rx.Replace(input, Evaluator);
        Console.WriteLine("output: " + output);
    }

    static string Evaluator(Match match)
    {
        string replaceText;
        int numberOfSpaces = match.Value.Length;
        if (numberOfSpaces > 3) {
            // Divide by 2.0 so Ceiling and Floor differ for odd counts.
            double mathresult = numberOfSpaces / 2.0;
            int numberNBSP = (int) Math.Ceiling(mathresult);
            int numberSpace = (int) Math.Floor(mathresult);
            replaceText = "";
            for (; numberNBSP > 0; numberNBSP--) replaceText += "\u00A0"; // non-breaking space
            for (; numberSpace > 0; numberSpace--) replaceText += " ";
        } else {
            // Short runs are returned unchanged.
            replaceText = match.Value;
        }
        return replaceText;
    }
}
Obviously, the logic of replacing the spaces is your own, and I didn't look into that.
Alternatively, you could use the regex string " {4,}", which matches 4 or more space characters; then you could get rid of the if (numberOfSpaces > 3) test and its else branch.
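A minimal sketch of that variant, assuming the non-breaking space is emitted as the character U+00A0 (swap in "&nbsp;" if the output is HTML). The names SpaceRuns and HalfToNbsp are hypothetical and only for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpaceRuns
{
    // Matches only runs of four or more spaces, so no length check is needed.
    private static readonly Regex LongRun = new Regex(@" {4,}");

    // Hypothetical helper: replaces the first half (rounded up) of each
    // long run with non-breaking spaces and keeps the rest as plain spaces.
    public static string HalfToNbsp(string input)
    {
        return LongRun.Replace(input, match =>
        {
            int total = match.Value.Length;
            int nbspCount = (total + 1) / 2;   // integer ceiling of total / 2
            int spaceCount = total / 2;        // integer floor of total / 2
            return new string('\u00A0', nbspCount) + new string(' ', spaceCount);
        });
    }
}
```

Integer arithmetic is used here instead of Math.Ceiling and Math.Floor, which sidesteps the double conversion altogether; runs shorter than four spaces never match and are left untouched.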
If you want to be able to match all whitespace, such as tabs and newlines, then use \s rather than a single space character.
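As a rough illustration of the difference, the sketch below reports the length of every run that \s{4,} matches. WhitespaceDemo and RunLengths are hypothetical names, not part of any library:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Returns the length of each run of four or more whitespace characters
    // (spaces, tabs, newlines, and anything else \s matches).
    public static int[] RunLengths(string input)
    {
        return Regex.Matches(input, @"\s{4,}")
                    .Cast<Match>()
                    .Select(m => m.Length)
                    .ToArray();
    }
}
```

For the input "a \t\n  b    c" this yields two runs, of lengths 5 and 4, whereas " {4,}" would only find the second. Keep in mind that an evaluator which rebuilds such a run from spaces and non-breaking spaces will discard any tabs or newlines it contained.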