I'm trying to find a regular expression that can parse a line like:
"hello","here, I am","Building "A" and more","Building "B", Indiana"
I expect to find
- "hello"
- "here, I am"
- "Building "A" and more"
- "Building "B", Indiana"
The regex (?:^|\,)(\"(?:[^\"]\,)*\"|[^\,]*)
correctly parses elements that contain double quotes (such as "Building "A" and more"), and the regex (?:^|\,)(\"(?:[^\"]\,?)*\"|[^\,]*)
parses elements that contain a comma (such as "here, I am"), but I'm having a hard time finding a single expression that correctly parses both kinds of element as well as the last one, which contains both a comma and double quotes. Note that an element may contain more than one comma and more than one double quote.
I will use this regex in a C# .NET Core 6 application.
Answer:
What about this regex, used with the global (g) flag:
'(?<v1>.+?)'(?=,')|'(?<v2>.+)'
Note: I've used single quotes in place of double quotes only to make the pattern easier to test on Regex101.
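Explanation:
- '(?<v1>.+?)'(?=,') matches a quoted element, as short as possible, whose closing quote is followed by a comma and another quote (first priority).
- '(?<v2>.+)' matches a quoted element that is not followed by a comma and a quote, such as the last element on the line (second priority).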
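Here is a minimal C# sketch of this approach, with double quotes put back in place of the single quotes and the quantifiers written as .+? and .+. The class name CsvLineSplitter and the method SplitFields are made up for illustration. The sketch assumes every element is quoted and non-empty, and that no element itself contains the sequence "," (quote, comma, quote).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class CsvLineSplitter
{
    // v1: quoted element whose closing quote is followed by ," (tried first).
    // v2: quoted element that runs to the last quote on the line (fallback).
    private static readonly Regex FieldPattern =
        new Regex("\"(?<v1>.+?)\"(?=,\")|\"(?<v2>.+)\"");

    public static List<string> SplitFields(string line)
    {
        var fields = new List<string>();
        // Regex.Matches returns every match, so no "g" flag is needed in .NET.
        foreach (Match m in FieldPattern.Matches(line))
        {
            // Only one of the two named groups participates in a given match.
            fields.Add(m.Groups["v1"].Success
                ? m.Groups["v1"].Value
                : m.Groups["v2"].Value);
        }
        return fields;
    }
}
```

The group values exclude the outer quotes of each element. If the quoted form is wanted, m.Value can be collected instead of the named groups.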