How to split a string in C# using comma when the string also contains commas

Time:01-23

I have the following string:

1, 20045, abc, "new york, some2", new york, your name

How do I split this string on commas when one of the values also contains a comma?

CodePudding user response:

As the comments by @jmcilhinney mention, you should ideally be using a CSV parser here. If you want to stick with a splitting approach, I would suggest a regex find-all with the following pattern:

".*?"|[^\s,]+(?: [^\s,]+)*

This pattern says to match:

  • ".*?" first try to consume a double-quoted term, possibly containing commas
  • | OR
  • [^\s,]+ match a term containing no whitespace or commas
  • (?: [^\s,]+)* possibly followed by a space and another term, zero or more times

The alternation tries the double-quoted branch first; only when that fails does a comma act as a separator.

Sample script:

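// Requires: using System; using System.Text.RegularExpressions;
string text = "1, 20045, abc, \"new york, some2\", new york, your name";
// In a verbatim string, "" stands for one literal double quote.
string search = @""".*?""|[^\s,]+(?: [^\s,]+)*";
MatchCollection matches = Regex.Matches(text, search);
foreach (Match match in matches)
{
    // Group 0 is the whole match, i.e. one field.
    GroupCollection groups = match.Groups;
    Console.WriteLine(groups[0].Value);
}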

This prints:

1
20045
abc
"new york, some2"
new york
your name

CodePudding user response:

It is likely best to pick some library that can handle CSV files.
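
As one possible illustration of the library route, the sketch below uses TextFieldParser from the Microsoft.VisualBasic.FileIO namespace, which ships with .NET. The class name CsvParserDemo and the method name ParseLine are made up for this example, and the settings shown are assumptions about what suits the string in the question, not a recommendation from the original answer.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical wrapper class, named only for this example.
public static class CsvParserDemo
{
    // Parses a single comma-delimited line, treating double-quoted
    // fields as one value even when they contain commas.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // honor "..." around a field
            parser.TrimWhiteSpace = true;            // drop spaces around each field

            // ReadFields returns null when there is no more data.
            return parser.ReadFields() ?? Array.Empty<string>();
        }
    }

    public static void Main()
    {
        var text = "1, 20045, abc, \"new york, some2\", new york, your name";
        foreach (var field in ParseLine(text))
        {
            Console.WriteLine(field);
        }
    }
}
```

A dedicated parser also copes with details such as escaped quotes inside a field, which hand-written splitting tends to miss.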

Otherwise, this could work in cases like yours:

// Requires: using System.Collections.Generic;
public static string[] Split(string str)
{
    // Record the position of every comma that is outside double quotes.
    var indices = new List<int>();
    var insideQuote = false;
    for (var i = 0; i < str.Length; ++i)
    {
        switch (str[i])
        {
            case '"':
                insideQuote ^= true; // toggle on each quote character
                break;
            case ',':
                if (!insideQuote) { indices.Add(i); }
                break;
        }
    }
    if (indices.Count == 0)
    {
        return new[] { str, };
    }

    // N separators yield N + 1 fields.
    var arr = new string[indices.Count + 1];
    arr[0] = str.Substring(0, indices[0]);
    for (var i = 1; i < arr.Length - 1; ++i)
    {
        arr[i] = str.Substring(indices[i - 1] + 1, indices[i] - indices[i - 1] - 1);
    }
    arr[arr.Length - 1] = str.Substring(indices[arr.Length - 2] + 1);

    return arr;
}
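
Note that Split keeps whatever surrounds each value, so for the string in the question the fields come back with their leading spaces and the quoted field keeps its quote characters. A minimal sketch of a variant that cleans this up is shown below. It uses the same quote-toggle idea but builds each field with a StringBuilder instead of recording indices. The names QuotedSplitDemo and SplitAndClean are hypothetical, and the sketch assumes quotes are never escaped inside a field.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical class, named only for this example.
public static class QuotedSplitDemo
{
    // Splits on commas outside double quotes, drops the quote
    // characters, and trims whitespace around each field.
    // Assumption: no escaped quotes ("") inside quoted fields.
    public static List<string> SplitAndClean(string str)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        var insideQuote = false;

        foreach (var c in str)
        {
            if (c == '"')
            {
                insideQuote = !insideQuote; // the quote itself is not copied
            }
            else if (c == ',' && !insideQuote)
            {
                fields.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString().Trim()); // last field has no trailing comma

        return fields;
    }

    public static void Main()
    {
        var text = "1, 20045, abc, \"new york, some2\", new york, your name";
        foreach (var field in SplitAndClean(text))
        {
            Console.WriteLine(field);
        }
        // Prints six lines:
        // 1
        // 20045
        // abc
        // new york, some2
        // new york
        // your name
    }
}
```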