An elegant and efficient way to split a string into desired substrings in C#

Time:07-06

Say I have the following string:

"3+1-sin(90-(70-20.5)*2))"

I'd like to split it into substrings, creating the following list of strings:

{ "3", "+", "1", "-", "sin", "(", "90", "-", "(", "70", "-", "20.5", ")", "*", "2", ")", ")"};

What's a good way to do so? Iterating through the input string and adding each element to a new list? Alternatively, is it better to use a regular-expression-based approach? I would appreciate seeing what such a regex pattern should look like, as well as any other methods that suit my use case.

Thanks.

CodePudding user response:

This regex fits your sample.

/\d+\.\d+|\d+|\+|\-|\(|\)|\*|sin/gm

On https://regex101.com you can test this and you will get every alternative explained. Have fun tokenizing.
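For anyone who wants to try the pattern from C#, here is a minimal sketch of how it might be applied with Regex.Matches. The helper name TokenizeWithRegex and the use of RegexOptions.Compiled are my own assumptions, not part of the answer above, and the snippet is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Same alternatives as the pattern above, written as a C# verbatim string.
// The decimal alternative comes first so "20.5" is not split into "20" and "5".
// RegexOptions.Compiled is an assumption that only pays off for repeated use.
var tokenPattern = new Regex(@"\d+\.\d+|\d+|\+|\-|\(|\)|\*|sin", RegexOptions.Compiled);

// Hypothetical helper: returns every match, in input order, as a list of strings.
List<string> TokenizeWithRegex(string expression) =>
    tokenPattern.Matches(expression).Cast<Match>().Select(m => m.Value).ToList();

var regexTokens = TokenizeWithRegex("3+1-sin(90-(70-20.5)*2))");
Console.WriteLine(string.Join(", ", regexTokens));
```

Note that Regex.Matches silently skips anything the pattern does not cover (for example "cos" or "/"), so the alternatives would need extending for a larger grammar.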

CodePudding user response:

Here is an algorithm I put together using your sample. For each character in the equation, the goal is to determine if that char is part of a number sequence, part of a math function declaration, or a symbol character.

// Requires: using System.Collections.Generic; using System.Linq;
private string[] SplitEquation(string equation)
{
    var mathFuncs = new List<string>() { "sin", "cos" };
    var output = new List<string>();
    var isNumber = false;
    var numberStr = "";
    var thisChar = ' ';
    var maxFuncLength = mathFuncs.Max(func => func.Length);

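    // Returns the math function name that starts at index, or null if none does.
    string GetMathFunc(int index)
    {
        // Stop at the end of the input so Substring cannot throw.
        for (int i = 0; i < maxFuncLength && index + i < equation.Length; i++)
        {
            var substring = equation.Substring(index, i + 1);

            if (mathFuncs.Contains(substring))
            {
                return substring;
            }
        }

        return null;
    }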
    
    for (int i = 0; i < equation.Length; i++)
    {
        thisChar = equation[i];

        // Digits always belong to a number; ',' and '.' only do so inside one.
        if (char.IsNumber(thisChar) ||
            isNumber && (thisChar == ',' || thisChar == '.'))
        {
            numberStr += thisChar;
            isNumber = true;
        }
        else
        {
            if (isNumber)
            {
                isNumber = false;

                if (!string.IsNullOrEmpty(numberStr))
                {
                    output.Add(numberStr);
                    numberStr = "";
                }
            }

            if (!char.IsLetterOrDigit(thisChar))
            {
                output.Add(thisChar.ToString());
            }
            else
            {
                var mathFunc = GetMathFunc(i);
                if (mathFunc != null)
                {
                    output.Add(mathFunc);
                    // Skip the rest of the name; the loop's i++ consumes the last character.
                    i += mathFunc.Length - 1;
                }
                else
                {
                    output.Add(thisChar.ToString());
                }
            }
        }
    }

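    // Flush a number that runs to the end of the input.
    if (numberStr.Length > 0)
    {
        output.Add(numberStr);
    }

    return output.ToArray();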
}

Sample Usage:

var input = "3+1-sin(90-(70-20.5)*2))";
var output = SplitEquation(input);
Console.WriteLine(string.Join(", ", output));

The question did not mention performance requirements, and honestly I do not know how this algorithm would rate (probably on the low end). However, it gets the job done and was fun to write!
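If performance does turn out to matter, one possible variation is a single-pass scanner that remembers where each token starts and slices it out with Substring, rather than building numbers one character at a time. The sketch below is only an illustration under my own assumptions: ScanTokens is a hypothetical name, it treats any run of letters as a name instead of checking a list of known functions, and it is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: scans the input once and slices each token out with
// Substring instead of concatenating characters.
List<string> ScanTokens(string expression)
{
    var tokens = new List<string>();
    int pos = 0;

    while (pos < expression.Length)
    {
        int start = pos;
        char c = expression[pos];

        if (char.IsDigit(c))
        {
            // Consume digits and decimal separators as one number token.
            while (pos < expression.Length &&
                   (char.IsDigit(expression[pos]) || expression[pos] == '.' || expression[pos] == ','))
            {
                pos++;
            }
        }
        else if (char.IsLetter(c))
        {
            // Assumption: any run of letters is a single name token (e.g. "sin").
            while (pos < expression.Length && char.IsLetter(expression[pos]))
            {
                pos++;
            }
        }
        else
        {
            // Any other character (operator, parenthesis, space) is its own token.
            pos++;
        }

        tokens.Add(expression.Substring(start, pos - start));
    }

    return tokens;
}

var scannedTokens = ScanTokens("3+1-sin(90-(70-20.5)*2))");
Console.WriteLine(string.Join(", ", scannedTokens));
```

Each character is visited once, so the work grows linearly with the input length. The trade-off is that unknown names such as "foo" come back as one token rather than as individual letters, which may or may not be what you want.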

I am happy to go over any part of the algorithm.
