Regex expression to get numbers without parentheses ()

Time:10-27

I'm trying to create a regex that selects the numbers (or numbers with commas, if that's easier; I can trim the commas later) that are not followed by parentheses. The numbers inside the parentheses should not be selected either.

It will be used with JavaScript's String.match method.

Example strings

9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4

What I have so far:

/((^\d+[^\(])|(,\d+,)|(,*\d+$))/gm

I tried this in regex101, underlining the numbers I would like to match and putting an x on the ones that should not match.


CodePudding user response:

You could start with a substitution that replaces all the unwanted parts with an empty string:

/\d*\(.*?\),?//gm


This leaves you with

5,10
10,2,5,
10,7,2,4

which makes the matching pretty straightforward:

/(\d+)/gm
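
The two steps above can be sketched in JavaScript roughly as follows. This is only an illustration under the assumption that each input line should yield its own list of numbers; the variable names `stripped` and `perLine` are made up for the example and are not part of the original answer.

```javascript
// Sample input from the question.
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;

// Step 1: remove every "number(...)" group plus an optional trailing comma.
// "." does not match a newline, so a group never spans two lines.
const stripped = text.replace(/\d*\(.*?\),?/g, "");

// Step 2: collect the remaining runs of digits, line by line.
// String.match returns null when nothing matches, hence the "?? []" fallback.
const perLine = stripped.split("\n").map(line => line.match(/\d+/g) ?? []);

console.log(perLine);
// [ [ '5', '10' ], [ '10', '2', '5' ], [ '10', '7', '2', '4' ] ]
```

Splitting by line after the substitution keeps the results grouped per input string; drop the split if one flat list is enough.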

If you want it as a single match expression you could use a negative lookbehind:

/(?<!\([\d,]*)(\d+)(?:,|$)/gm

Here's the same matching expression as runnable JavaScript (skeleton code borrowed from Wiktor's answer):

const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;
const matches = Array.from(text.matchAll(/(?<!\([\d,]*)(\d+)(?:,|$)/gm), x => x[1]);
console.log(matches);
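
Since the question mentions String.match, here is a possible variant that swaps the trailing non-capturing group for a lookahead, so the comma is not consumed and the full matches are already the bare numbers. It is a sketch, assuming an engine with lookbehind support (ES2018, e.g. current Node.js or Chrome); the variable name `numbers` is only illustrative.

```javascript
// Sample input from the question.
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;

// (?<!\([\d,]*)  the digits must not sit after an unclosed "(" on the same line
// \d+            the number itself
// (?=,|$)        it must be followed by a comma or the end of the line,
//                checked without consuming anything
// With the g flag, String.match returns the full matches (or null if there are none).
const numbers = text.match(/(?<!\([\d,]*)\d+(?=,|$)/gm) ?? [];

console.log(numbers);
// [ '5', '10', '10', '2', '5', '10', '7', '2', '4' ]
```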

CodePudding user response:

Here, I'd recommend the so-called "best regex trick ever": match what you do not need (negative contexts), match and capture what you do need, and keep only the captured items.

If you want to match integer numbers that are not matched by the \d+\([^()]*\) pattern (a number followed by a parenthetical substring), you can match this pattern, or match and capture \d+ (one or more digits), and then simply grab the Group 1 values from the matches:

const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;
const matches = Array.from(text.matchAll(/\d+\([^()]*\)|(\d+)/g), x => x[1] ?? "").filter(Boolean);
console.log(matches);

Details:

  • text.matchAll(/\d+\([^()]*\)|(\d+)/g) - matches one or more digits (\d+), then a ( char (\(), then zero or more chars other than ( and ) ([^()]*), then a ) char (\)); or (|) one or more digits captured into Group 1 ((\d+))
  • Array.from(..., x=> x[1] ?? "") - gets Group 1 value, or, if not assigned, just adds an empty string
  • .filter(Boolean) - removes empty strings.

CodePudding user response:

Using several replacement regexes

var textA = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
`
console.log('A', textA)

var textB = textA.replace(/\(.*?\),?/g, ';')
console.log('B', textB)

var textC = textB.replace(/^\d+|\d+$|\d*;\d*/gm, '')
console.log('C', textC)

var textD = textC.replace(/,+/g, ' ').trim()
console.log('D', textD)

With a loop

Here is a solution which splits the lines on comma and loops over the pieces:

var inside = false;
var result = [];

`9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
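`.split("\n").forEach(line => {
  let pieceArray = line.split(",")
  pieceArray.forEach((piece, k) => {
    if (piece.includes('(') || piece.includes(')')) {
      // "(" opens a group and ")" closes it; a piece such as "3(123)" does both
      inside = !piece.includes(')')
    } else if (!inside && k > 0 && k < pieceArray.length-1 && !pieceArray[k-1].includes(')')) {
      // keep a number only if it is outside parentheses, neither first nor last
      // on its line, and not directly after a closing parenthesis
      result.push(piece)
    }
  })
})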

console.log(result)

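For the sample input it prints ["5", "7"]: the loop keeps only numbers that are neither first nor last on their line, lie outside parentheses, and do not directly follow a parenthesized group.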
