Split string and keep new lines

Time:11-26

I'm trying to split a string, ultimately into a 2D array, with a semicolon as the delimiter.

var str = `2;poisson
            poisson
           3; Fromage
           6;Monique`

to

var arr = [[2, `poisson
               poisson`],
          [3, " Fromage"],
          [6, "Monique"]]

The array is in the format

[int, string that may start with white space and may end with possible new lines]

The first step would be via regex. However, (\d+\;\s?)(.*) doesn't capture the continuation lines, because . does not match newline characters (tested on Regex101).

I'm a little confused as to how to proceed as the newlines/carriage returns are important and I don't want to lose them. My RegEx Fu is weak today.

CodePudding user response:

With Javascript, you could use 2 capture groups:

\b(\d+);([^]+?)(?=\n\s*\d+;|$)

The pattern matches:

  • \b A word boundary
  • (\d+); Capture group 1, capture 1+ digits, followed by matching ;
  • ( Capture group 2
    • [^]+? Match 1+ times any character including newlines, as few as possible
  • ) Close group
  • (?= Positive lookahead, assert what is to the right is
    • \n\s*\d+;|$ Match either a newline followed by optional whitespace chars and the first pattern, or the end of the string
  • ) Close lookahead

const str = `2;poisson
            poisson
           3; Fromage
           6;Monique`;


const regex = /\b(\d+);([^]+?)(?=\n\s*\d+;|$)/g;
console.log(Array.from(str.matchAll(regex), m => [m[1], m[2]]))
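The question asks for an integer in the first slot, whereas matchAll returns both groups as strings. A minimal sketch of the conversion, assuming the same pattern as above; the helper name parseRecords is hypothetical and not part of the original answer:

```javascript
// Hypothetical helper: parse "<int>;<text>" records whose text may span lines.
function parseRecords(input) {
  // Group 1: the digits before ";". Group 2: everything up to the next
  // "<newline><optional whitespace><digits>;" or the end of the string.
  const regex = /\b(\d+);([^]+?)(?=\n\s*\d+;|$)/g;
  // Number() turns the captured digits into an integer; the text is kept as-is,
  // including any embedded newlines.
  return Array.from(input.matchAll(regex), m => [Number(m[1]), m[2]]);
}

const records = parseRecords("2;poisson\n  poisson\n 3; Fromage\n 6;Monique");
console.log(JSON.stringify(records));
// [[2,"poisson\n  poisson"],[3," Fromage"],[6,"Monique"]]
```

Note that a digit run followed by ; inside the text itself would also start a new match, so this only fits input where that cannot happen.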

CodePudding user response:

Here is a short and sweet solution to get the result with two nested .split():

const str = `2;poisson
    poisson
3; Fromage
6;Monique`;
let result = str.split(/\n(?! )/).map(line => line.split(/;/));
console.log(JSON.stringify(result));

Output:

[["2","poisson\n    poisson"],["3"," Fromage"],["6","Monique"]]

Explanation of the first split regex:

  • \n -- newline (possibly change to \r?\n to support Windows newlines)
  • (?! ) -- negative lookahead for space
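A variant of the split approach, sketched under two assumptions that go beyond the answer: line endings may be Windows-style, and the text part may itself contain a semicolon. Like the answer, it relies on continuation lines starting with a space and record lines not being indented. The helper name splitRecords is hypothetical:

```javascript
// Hypothetical helper: split records on newlines that are not followed by a
// space, then split each record at its first ";" only.
// Assumes every record contains at least one ";".
function splitRecords(input) {
  return input
    .split(/\r?\n(?! )/) // \r?\n also accepts Windows line endings
    .map(line => {
      const i = line.indexOf(";");
      // Everything before the first ";" is the number, the rest is the text.
      return [Number(line.slice(0, i)), line.slice(i + 1)];
    });
}

const input = "2;poisson\r\n    poisson\r\n3; a;b\r\n6;Monique";
console.log(JSON.stringify(splitRecords(input)));
// [[2,"poisson\r\n    poisson"],[3," a;b"],[6,"Monique"]]
```

Cutting at the first semicolon with indexOf keeps later semicolons inside the text, whereas line.split(/;/) would break the record into more than two parts.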