I know similar questions are available but I could not find this case.
CASE 1: 'a,b,c,d,e'
OUTPUT: ["a", "b", "c", "d", "e"]
CASE 2: 'a,b,"c,d", e'
OUTPUT: ["a", "b", "c,d", "e"]
CASE 3: 'a,,"c,d", e'
OUTPUT: ["a", "", "c,d", "e"]
RegEx that I tried: (".*?"|[^",]+)(?=\s*,|\s*$)
RegEx Link: https://regex101.com/r/xImG4i/1
This regex works well with CASE 1 and CASE 2 but fails for CASE 3. Instead, it works for
'a, ,"c,d", e'
, giving output as ["a", " ", "c,d", "e"]
which is also fine, but it needs to work for CASE 3 as well.
Thanks in advance!
CodePudding user response:
If lookbehind is supported, you can add an alternative that matches optional whitespace chars between two commas, so that an empty field is returned as a match.
"[^"]*"|[^\s,'"]+(?:\s+[^\s,'"]+)*|(?<=,)\s*(?=,)
const regex = /"[^"]*"|[^\s,'"]+(?:\s+[^\s,'"]+)*|(?<=,)\s*(?=,)/g;
[
`'a,b,c,d,e'`,
`'a,b,"c,d", e'`,
`'a,,"c,d", e'`,
` xz a,, b, c, "d, e, f", g, h`,
`'a, ,"c,d", e'`,
].forEach(s =>
console.log(s.match(regex))
)
If you don't want the double quotes you can use a capture group with matchAll and check for the group in the callback.
const regex = /"([^"]*)"|[^\s,'"]+(?:\s+[^\s,'"]+)*|(?<=,)\s*(?=,)/g;
[
`'a,b,c,d,e'`,
`'a,b,"c,d", e'`,
`'a,,"c,d", e'`,
` xz a,, b, c, "d, e, f", g, h`,
`'a, ,"c,d", e'`,
].forEach(s =>
console.log(Array.from(s.matchAll(regex), m => m[1] ?? m[0])) // ?? keeps an empty quoted field "" as ""
)
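As a minimal sketch, the capture-group variant above can be wrapped in a helper and checked against the cases from the question. The name `matchCsvFields` is hypothetical, and the use of `??` (so that an empty quoted field `""` comes back as an empty string) is an assumption for illustration rather than part of the answer's code.

```javascript
// Hypothetical helper around the capture-group regex; requires lookbehind support.
const fieldRe = /"([^"]*)"|[^\s,'"]+(?:\s+[^\s,'"]+)*|(?<=,)\s*(?=,)/g;

function matchCsvFields(line) {
  // m[1] is defined only when the quoted alternative matched.
  return Array.from(line.matchAll(fieldRe), m => m[1] ?? m[0]);
}

console.log(matchCsvFields('a,b,c,d,e'));     // [ 'a', 'b', 'c', 'd', 'e' ]
console.log(matchCsvFields('a,b,"c,d", e'));  // [ 'a', 'b', 'c,d', 'e' ]
console.log(matchCsvFields('a,,"c,d", e'));   // [ 'a', '', 'c,d', 'e' ]
console.log(matchCsvFields('a,'));            // [ 'a' ]
```

Note that the empty-field alternative needs a comma on both sides, so an empty field at the very start or end of the string (as in the last call) is not returned by this pattern.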
CodePudding user response:
An alternate solution that uses a regex for splitting instead of matching:
/,\s*(?=(?:(?:[^"]*"){2})*[^"]*$)/
This regex splits on a comma followed by optional whitespace, but only when the comma is outside double quotes. The lookahead checks this by asserting that an even number of double quotes remains between the comma and the end of the string.
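To see the lookahead at work, the sketch below lists the index of every comma the regex accepts as a split point. The helper name `commaSplitPoints` and the sample strings are assumptions for illustration; the `g` flag is added only so that `matchAll` can be used.

```javascript
// Same pattern as above, with the g flag so matchAll can report every match.
const splitPointRe = /,\s*(?=(?:(?:[^"]*"){2})*[^"]*$)/g;

// Hypothetical helper: returns the index of each comma that is outside double quotes.
function commaSplitPoints(s) {
  return Array.from(s.matchAll(splitPointRe), m => m.index);
}

// Commas sit at indices 1, 4 and 7; the one at 4 is inside quotes
// (one quote follows it, an odd count), so it is rejected.
console.log(commaSplitPoints('a,"c,d",e')); // [ 1, 7 ]
console.log(commaSplitPoints('"x,y",z'));   // [ 5 ]
```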
Code Sample:
const re = /,\s*(?=(?:(?:[^"]*"){2})*[^"]*$)/;
[
`a,b,"c,d", e`,
`a,,"c,d", e`,
` xz a,, b, c, "d, e, f", g, h`,
`a, ,"c,d", e`,
].forEach(s => {
const tok = s.split(re);
tok.forEach((e, i) => tok[i] = e.replace(/^"|"$/g, ''))
console.log(s, '::', tok);
})
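The split approach can be packaged the same way and checked against the cases from the question. This is only a sketch: the name `splitCsvFields` is hypothetical, and the `trim()` call is an added assumption so that stray whitespace around a field does not prevent the surrounding quotes from being stripped.

```javascript
// Split on commas (plus following whitespace) that are outside double quotes.
const separatorRe = /,\s*(?=(?:(?:[^"]*"){2})*[^"]*$)/;

// Hypothetical helper: split, trim each field, then drop one leading/trailing quote.
function splitCsvFields(line) {
  return line
    .split(separatorRe)
    .map(field => field.trim().replace(/^"|"$/g, ''));
}

console.log(splitCsvFields('a,b,c,d,e'));     // [ 'a', 'b', 'c', 'd', 'e' ]
console.log(splitCsvFields('a,b,"c,d", e'));  // [ 'a', 'b', 'c,d', 'e' ]
console.log(splitCsvFields('a,,"c,d", e'));   // [ 'a', '', 'c,d', 'e' ]
console.log(splitCsvFields('a,'));            // [ 'a', '' ]
```

Because `split` returns the text between separators, empty fields at the start or end of the line are kept, as the last call shows.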