After years of sponging knowledge from the SO community, I'm here to ask my first question.
I'm trying to create a regular expression for a JavaScript web project to find a whole number (positive or negative), with or without thousands separators, in a given string.
Some valid examples:
561,085
3,894,320
-59,099
1,000
1000
-1000
1
-1
0
Invalid examples:
-01
01393
01,300,00
-04,044
I've created this expression so far:
\b(?:[0])|(?:(((-)?[1-9]?\d{2}|\d)((,)?\d{3})|((-)?[1-9]+\d*)))\b
(?:[0]) - match a leading 0
|(?:(((-)?[1-9]?\d{2}|\d)((,)?\d{3}) - or match a 1-9 starting number with a comma and 3 digits after
|((-)?[1-9]+\d*))) - or just match a whole 1-9 starting number without any commas
I think it works in all cases except when I test a string with 5 digits following a minus sign, for example:
Value: -01110
In this case it accepts 01110 as a match.
Could anyone help me figure out what I've gotten wrong with the expression?
Also would not mind some general pointers/feedback on how I popped the ask-a-question-cherry here, cheers!
CodePudding user response:
You could use
^(?:0|-?[1-9]\d*|-?[1-9]\d{0,2}(?:,\d{3})+)$
That means:
^                               # start of the string
(?:                             # open non-capturing group
    0                           # zero
    |                           # or
    -?[1-9]\d*                  # -100 or 234 or -121323232
    |                           # or
    -?[1-9]\d{0,2}(?:,\d{3})+   # 1 to 3 digits, then one or more comma-separated groups of 3 digits
)                               # close group
$                               # end of the string
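As a quick sanity check, the anchored pattern can be run against the examples from the question. This is only a sketch: `isWholeNumber` is a name made up for illustration, and it assumes the comma group `(?:,\d{3})` is meant to repeat one or more times (`+`), which is what the multi-group examples such as 3,894,320 require.

```javascript
// Anchored pattern: the whole string must be exactly one number.
// Assumption: the comma group carries a "+" quantifier (one or more groups).
const WHOLE_NUMBER = /^(?:0|-?[1-9]\d*|-?[1-9]\d{0,2}(?:,\d{3})+)$/;

// Hypothetical helper, introduced only for this sketch.
function isWholeNumber(text) {
  return WHOLE_NUMBER.test(text);
}

// Examples copied from the question.
const valid = ["561,085", "3,894,320", "-59,099", "1,000", "1000", "-1000", "1", "-1", "0"];
const invalid = ["-01", "01393", "01,300,00", "-04,044"];

// Keep only the strings the pattern accepts.
const acceptedValid = valid.filter((s) => isWholeNumber(s));
const acceptedInvalid = invalid.filter((s) => isWholeNumber(s));

console.log(acceptedValid.length === valid.length); // true
console.log(acceptedInvalid.length);                // 0
```

Because the regex has no `g` flag, `test` keeps no state between calls, so the same object can be reused safely.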
To actually use it "in the wild" (that is, in any given text), you need to replace the anchors with lookarounds:
(?<=^|\s)(?:0|-?[1-9]\d*|-?[1-9]\d{0,2}(?:,\d{3})+)(?=\s|$)
# ^^1^^                                             ^^2^^
^^1^^ means "either the start of the string or whitespace directly in front of the expression", and ^^2^^ means "either whitespace or the end of the string right after the expression".
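For extracting numbers from running text in JavaScript, the lookaround version could be used roughly as follows. This is a sketch, not a definitive implementation: `findWholeNumbers` is a hypothetical helper name, the comma group is assumed to carry a `+` quantifier, and the lookbehind `(?<=...)` needs an engine with ES2018 lookbehind support.

```javascript
// Lookaround variant: a number must be delimited by whitespace or the string edges.
// The "g" flag is required by String.prototype.matchAll.
const NUMBER_IN_TEXT = /(?<=^|\s)(?:0|-?[1-9]\d*|-?[1-9]\d{0,2}(?:,\d{3})+)(?=\s|$)/g;

// Hypothetical helper, introduced only for this sketch:
// returns every matched number as a string, in order of appearance.
function findWholeNumbers(text) {
  return Array.from(text.matchAll(NUMBER_IN_TEXT), (m) => m[0]);
}

// The problem case from the question: nothing is matched.
console.log(findWholeNumbers("Value: -01110")); // []

// Several numbers in one sentence.
console.log(findWholeNumbers("Sold 3,894,320 units and lost -59,099 of 1000"));
// [ '3,894,320', '-59,099', '1000' ]
```

One consequence of delimiting by whitespace only: a number directly followed by punctuation, as in "1,000.", is not matched. If that matters, the lookahead would need to allow the punctuation you expect.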
See another demo on regex101.com.