I'm trying to split a string, ultimately into a 2D array, with a semicolon as the delimiter.
var str = "2;poisson
 poisson
3; Fromage
6;Monique"
to
var arr = [[2, "poisson
 poisson"],
[3," Fromage"],
[6,"Monique"]]
The array is in the format
[int, string that may start with white space and may end with possible new lines]
The first step would be via regex. However, using (\d+\;\s?)(.) doesn't grab values that contain a newline (tried on Regex101).
I'm a little confused as to how to proceed as the newlines/carriage returns are important and I don't want to lose them. My RegEx Fu is weak today.
CodePudding user response:
With JavaScript, you could use 2 capture groups:
\b(\d+);([^]+?)(?=\n\s*\d+;|$)
The pattern matches:
\b -- a word boundary
(\d+); -- capture group 1, capture 1+ digits, followed by matching ;
( -- capture group 2
[^]+? -- match 1+ times any character including newlines, as few as possible
) -- close group
(?= -- positive lookahead, assert what is to the right is
\n\s*\d+;|$ -- match either a newline followed by optional whitespace chars and the first pattern, or the end of the string
) -- close lookahead
const str = `2;poisson
 poisson
3; Fromage
6;Monique`;
const regex = /\b(\d+);([^]+?)(?=\n\s*\d+;|$)/g;
console.log(Array.from(str.matchAll(regex), m => [m[1], m[2]]))
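The question asks for an int in the first column, while matchAll returns strings. A minimal sketch of one way to convert the first capture group, assuming the ids are plain decimal digits; parsePairs is a hypothetical helper name, not part of the answer above:

```javascript
// Hypothetical helper: same pattern as above, with group 1 converted to a number.
function parsePairs(input) {
  // Group 1: the digits before ";". Group 2: everything up to the next
  // "newline + digits + ;" or the end of the string, newlines included.
  const regex = /\b(\d+);([^]+?)(?=\n\s*\d+;|$)/g;
  return Array.from(input.matchAll(regex), m => [Number(m[1]), m[2]]);
}

// Same sample data as above, written with explicit \n escapes.
const sample = "2;poisson\n poisson\n3; Fromage\n6;Monique";
console.log(JSON.stringify(parsePairs(sample)));
// [[2,"poisson\n poisson"],[3," Fromage"],[6,"Monique"]]
```

The leading whitespace and embedded newlines in the second column are left untouched, since the question says they matter.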
CodePudding user response:
Here is a short and sweet solution to get the result with two nested .split() calls:
const str = `2;poisson
 poisson
3; Fromage
6;Monique`;
let result = str.split(/\n(?! )/).map(line => line.split(/;/));
console.log(JSON.stringify(result));
Output:
[["2","poisson\n poisson"],["3"," Fromage"],["6","Monique"]]
Explanation of the first split regex:
\n -- newline (possibly change to \r?\n to support Windows newlines)
(?! ) -- negative lookahead for space
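As a rough sketch of how the split approach could be hardened, the variant below treats CR LF as a single line break and cuts each record only at its first semicolon, so a value that itself contains ";" stays in one piece. It assumes continuation lines start with a space and that every record contains a semicolon; splitRecords is a hypothetical name:

```javascript
// Hypothetical helper: split-based parsing that tolerates Windows newlines.
function splitRecords(input) {
  return input
    // A record ends at a line break that is NOT followed by a space.
    .split(/\r?\n(?! )/)
    .map(record => {
      // Cut at the first ";" only; later semicolons stay in the text.
      const i = record.indexOf(";");
      return [Number(record.slice(0, i)), record.slice(i + 1)];
    });
}

const sample = "2;poisson\r\n poisson\r\n3; Fromage\r\n6;Monique";
console.log(JSON.stringify(splitRecords(sample)));
// [[2,"poisson\r\n poisson"],[3," Fromage"],[6,"Monique"]]
```

The line break inside the first value is kept exactly as it appeared in the input, carriage return included.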