How to separate symbols from text with regex?

Time:11-16

I have a text node whose data arrives in a form like data=$(2000)%, and I want to separate the symbols from the number.

let preText = ''
let number = ''
let postText = ''
let data = '$(2000)%'

const regex = new RegExp(`(\(\[)?[0-9]{1,3}(?:,[0-9]{1,3})*(\.[0-9]{1,5})?(\)\])?`); 

data.match(regex)

Output expected:
preText = '$('
number = '2000'
postText = ')%'

Another example: if data=$2,23,603, the output should be preText=$, number=2,23,603 and an empty postText.

I'm not able to get these parts into separate variables. How can I get the desired output?

CodePudding user response:

You can use

const data = '$(2000)%';
const regex = /^(.*?)((?:\d{1,3}(?:,\d{1,3})*|\d+)(?:\.\d{1,5})?)(\D.*)?$/;
const [_, preText, number, postTex] = data.match(regex);
console.log([preText, number, postTex]);

Details:

  • ^ - start of string
  • (.*?) - Group 1 (preText): any zero or more chars other than line break chars, as few as possible
  • ((?:\d{1,3}(?:,\d{1,3})*|\d+)(?:\.\d{1,5})?) - Group 2 (number): one to three digits followed by zero or more occurrences of a comma and one to three digits, or just one or more digits, and then an optional sequence of a . and one to five digits
  • (\D.*)? - Group 3 (postTex), optional: a non-digit char and then any zero or more chars other than line break chars, as many as possible
  • $ - end of string.
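
The pattern described above can be wrapped in a small helper so that the no-match case and the optional third group are handled explicitly. This is only a sketch: the function name splitSymbols, the null return for "no number found", and the empty-string default for the trailing part are assumptions for illustration, not part of the answer.

```javascript
// Hypothetical helper; its name and return shape are assumptions.
const regex = /^(.*?)((?:\d{1,3}(?:,\d{1,3})*|\d+)(?:\.\d{1,5})?)(\D.*)?$/;

function splitSymbols(data) {
  const m = data.match(regex);
  if (!m) return null; // the input contains no number at all
  // Group 3 is optional, so it is undefined when nothing follows the number;
  // the destructuring default turns that into an empty string.
  const [, preText, number, postText = ''] = m;
  return { preText, number, postText };
}

console.log(splitSymbols('$(2000)%'));  // { preText: '$(', number: '2000', postText: ')%' }
console.log(splitSymbols('$2,23,603')); // { preText: '$', number: '2,23,603', postText: '' }
console.log(splitSymbols('abc'));       // null
```

Checking for null before destructuring matters here, because destructuring the result of a failed match would throw a TypeError.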

CodePudding user response:

Some notes about the pattern:

  • First, testing your code gives this error message:

Uncaught SyntaxError: Invalid regular expression: /(([)?[0-9]{1,3}(?:,[0-9]{1,3})*(.[0-9]{1,5})?()])?/: Unterminated group

  • Apart from that, you have to double escape the backslashes when passing a string to the RegExp constructor
  • Between the parentheses there is 2000, which is 4 digits, and [0-9]{1,3} can only match 1-3 digits.
  • A pattern like \(\[ will only match when both ( and [ are present, in that order
  • The pattern does not match $ or %, so they cannot be in the output
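
The escaping point can be checked directly. The sketch below compares what reaches the RegExp constructor when the same short pattern is written as a template literal, as a double-escaped string, and as a regex literal; the pattern \(\d+\) and the variable names are chosen only for illustration.

```javascript
// In a template literal (or an ordinary string), "\(" is just "(" and "\d" is just "d",
// so the backslashes never reach the RegExp constructor.
const lost = `\(\d+\)`;
console.log(lost); // (d+)

// Doubling each backslash keeps it in the string.
const doubled = '\\(\\d+\\)';
console.log(doubled); // \(\d+\)

const fromString = new RegExp(doubled);
const literal = /\(\d+\)/; // a regex literal needs no extra escaping

console.log(fromString.source === literal.source); // true
console.log(fromString.test('$(2000)%'));          // true
console.log(new RegExp(lost).test('$(2000)%'));    // false: it looks for a literal "d"

// String.raw is another way to keep the backslashes.
console.log(String.raw`\(\d+\)` === doubled);      // true
```

When the pattern is fixed and not assembled from variables, a regex literal is usually the simplest way to avoid the problem.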

What you could do is either capture all the possible symbols in groups on the left and the right, using character classes like [$\[(] and [\])%]:

([$\[(]*)\b((?:[0-9]{1,3}(?:,[0-9]{1,3})*(?:\.[0-9]{1,5})?)|\d+)\b([\])%]*)


Or match any non-digits with \D* on both sides of the number:

(\D*)\b((?:[0-9]{1,3}(?:,[0-9]{1,3})*(?:\.[0-9]{1,5})?)|\d+)\b(\D*)


If there is a match, the capture group numbers are 1, 2 and 3:

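let data = '$(2000)%';
let preText = '', number = '', postText = '';
const regex = /([$\[(]*)\b((?:[0-9]{1,3}(?:,[0-9]{1,3})*(?:\.[0-9]{1,5})?)|\d+)\b([\])%]*)/;
const m = data.match(regex);
console.log(m);

if (m) {
  // m[0] is the full match; groups 1-3 hold the three parts
  preText = m[1];
  number = m[2];
  postText = m[3];
}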
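
The same approach can be tried with the \D* variant on both sample inputs from the question. This is a minimal sketch; the results array is only there to collect the output and is not part of the answer.

```javascript
// Variant that accepts any non-digit characters around the number.
const regex = /(\D*)\b((?:[0-9]{1,3}(?:,[0-9]{1,3})*(?:\.[0-9]{1,5})?)|\d+)\b(\D*)/;

const results = [];
for (const data of ['$(2000)%', '$2,23,603']) {
  const m = data.match(regex);
  if (m) {
    // Skip m[0] (the full match) and name the three groups.
    const [, preText, number, postText] = m;
    results.push({ preText, number, postText });
  }
}

console.log(results);
// [ { preText: '$(', number: '2000', postText: ')%' },
//   { preText: '$', number: '2,23,603', postText: '' } ]
```

Because the third group is (\D*) rather than an optional group, it always takes part in the match, so postText is an empty string instead of undefined when nothing follows the number.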
