I need to split a string in C#, using a space as the delimiter while keeping quoted substrings together. That part works.
In addition, I want to allow the escape sequence \" so that quotes can be included inside a quoted substring.
Example of what I need:
One Two "Three Four" "Five \"Six\""
To:
- One
- Two
- Three Four
- Five "Six"
This is the regex I am currently using. It works for all cases except "Five \"Six\"":
// Split on spaces unless in quotes
List<string> matches = Regex.Matches(input, @"[\""].+?[\""]|[^ ]+")
    .Cast<Match>()
    .Select(x => x.Value.Trim('"'))
    .ToList();
I'm looking for any regex that would do the trick.
CodePudding user response:
You can use
var input = "One Two \"Three Four\" \"Five \\\"Six\\\"\"";
// Console.WriteLine(input); // => One Two "Three Four" "Five \"Six\""
List<string> matches = Regex.Matches(input, @"(?s)""(?<r>[^""\\]*(?:\\.[^""\\]*)*)""|(?<r>\S+)")
.Cast<Match>()
.Select(x => Regex.Replace(x.Groups["r"].Value, @"\\(.)", "$1"))
.ToList();
foreach (var s in matches)
Console.WriteLine(s);
The result is
One
Two
Three Four
Five "Six"
The (?s)"(?<r>[^"\\]*(?:\\.[^"\\]*)*)"|(?<r>\S+) regex matches:
- (?s) - the inline equivalent of RegexOptions.Singleline, which makes . match newlines, too
- "(?<r>[^"\\]*(?:\\.[^"\\]*)*)" - a ", then Group "r" capturing zero or more chars other than " and \, followed by zero or more sequences of any escaped char and zero or more chars other than " and \, and then a closing " is matched
- | - or
- (?<r>\S+) - Group "r": one or more non-whitespace chars.
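To make the structure of the pattern easier to follow, here is a sketch of the same regex written with RegexOptions.IgnorePatternWhitespace, which lets the pattern span several lines and carry # comments. The variable names (pattern, options, tokens) are only illustrative, and the snippet assumes C# 9 or later so that top-level statements are available.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Same pattern as above, spread over several lines.
// IgnorePatternWhitespace ignores unescaped whitespace in the pattern and
// treats # as the start of a comment; Singleline replaces the inline (?s).
var pattern = @"
    ""                      # opening quote
    (?<r>                   # group r: the quoted content
        [^""\\]*            #   chars other than quote and backslash
        (?: \\. [^""\\]* )* #   an escaped char, then more ordinary chars
    )
    ""                      # closing quote
  |
    (?<r> \S+ )             # or group r: an unquoted token
";
var options = RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace;

var input = "One Two \"Three Four\" \"Five \\\"Six\\\"\"";
var tokens = Regex.Matches(input, pattern, options)
    .Cast<Match>()
    .Select(m => Regex.Replace(m.Groups["r"].Value, @"\\(.)", "$1"))
    .ToList();

foreach (var t in tokens)
    Console.WriteLine(t);
```

Both alternatives use the same group name, which .NET allows, so x.Groups["r"] holds the token text regardless of which branch matched.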
The .Select(x => Regex.Replace(x.Groups["r"].Value, @"\\(.)", "$1")) call takes the Group "r" value and unescapes all escaped chars, i.e. it deletes the \ in front of each of them.
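The unescaping step can also be tried on its own. This is a minimal sketch: Unescape is a hypothetical helper name introduced here for illustration, and the snippet again assumes C# 9 or later top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: drops the backslash in front of each escaped char
// and keeps the char itself (captured by group 1).
string Unescape(string s) => Regex.Replace(s, @"\\(.)", "$1");

Console.WriteLine(Unescape("Five \\\"Six\\\""));  // Five "Six"
Console.WriteLine(Unescape("C:\\\\temp"));        // C:\temp
```

Because each match consumes the backslash together with the char after it, an escaped backslash (\\) becomes a single \ rather than being processed twice. Note that . in this replacement pattern does not match a newline by default, so a backslash directly before a line break would be left in place unless RegexOptions.Singleline is passed here as well.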