I'm trying to capture 2 groups of numbers, where each group is optional and should only be captured if it contains numbers. Here is a list of all the valid combinations that it is supposed to match:
123(456)
123
(456)
abc(456)
123(efg)
And these are not valid combinations and should not be matched:
abc(efg)
abc
(efg)
However, my regex fails on combinations #4 and #5 even though they contain numbers.
const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
const regex = /^(?:(\d+))?(?:\((\d+)\))?$/;
list.forEach((a,i) => console.log(i+1+". ", a+"=>".padStart(11-a.length," "), JSON.stringify((a.match(regex)||[]).slice(1))));
So, the question is: why, when ? is used after a group, doesn't it "skip" that group if nothing matched?
P.S. With this regex it also captures #4, but not #5: /(?:^|(\d+)?)(?:\((\d+)\))?$/
CodePudding user response:
A solution to what you're looking for can be done with lookahead, see:
(?=^\d+(?:\(|$))(\d+)|(?=\d+\)$)(\d+)
Rough translation: a number from the start ending with a bracket (or end of line) OR a number in brackets somewhere in the text
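Because this pattern is an alternation, a single string can produce up to two separate matches, so it has to be run with the global flag. Here is a minimal sketch of how the two captures could be collected with matchAll; the helper name captureGroups is made up for illustration and is not part of the original answer.

```javascript
// Lookahead pattern from above, with the global flag so both alternatives can match.
const lookaheadRegex = /(?=^\d+(?:\(|$))(\d+)|(?=\d+\)$)(\d+)/g;

// Hypothetical helper: merges the captures of all matches into one [first, second] pair.
function captureGroups(input) {
  let first;
  let second;
  for (const m of input.matchAll(lookaheadRegex)) {
    if (m[1] !== undefined) first = m[1];   // leading number
    if (m[2] !== undefined) second = m[2];  // number before the closing bracket
  }
  return [first, second];
}

console.log(captureGroups("123(456)")); // [ '123', '456' ]
console.log(captureGroups("abc(456)")); // [ undefined, '456' ]
console.log(captureGroups("123(def)")); // [ '123', undefined ]
console.log(captureGroups("abc(def)")); // [ undefined, undefined ]
```

An all-undefined result means the string contained no capturable numbers.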
To answer the question about optional capturing groups
Yes, if a group is marked optional, e.g. (A*)?, the whole group becomes optional.
In your case, the regex as a whole simply does not match, even though the optional parts are skipped: the ^ and $ anchors require the entire string to be consumed, and nothing in the pattern can match "abc" or "(def)". You can verify this with a regex debugger.
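The point can be checked directly in JavaScript. The sketch below compares the anchored regex from the question with the same pattern without ^ and $ (the unanchored variant is only for illustration): once the anchors are gone, both optional groups are skipped and the result is an empty match at index 0.

```javascript
// Regex from the question: both groups optional, whole string anchored.
const anchored = /^(?:(\d+))?(?:\((\d+)\))?$/;
// Same pattern without anchors, for illustration only.
const unanchored = /(?:(\d+))?(?:\((\d+)\))?/;

const input = "abc(456)";

// Both groups are skipped at position 0, but then $ cannot match before "abc(456)".
const anchoredMatch = input.match(anchored);
console.log(anchoredMatch); // null

// Without anchors the skipped groups yield an empty match at the very start.
const unanchoredMatch = input.match(unanchored);
console.log(JSON.stringify(unanchoredMatch[0])); // ""
console.log(unanchoredMatch.index);              // 0
console.log(unanchoredMatch.slice(1));           // [ undefined, undefined ]
```

So the ? does skip the groups; it is the rest of the string that the anchored pattern has no way to account for.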
CodePudding user response:
@WiktorStribiżew and @akash had good ideas, but they rely on the global flag, which requires an additional loop to gather all the matches.
For now, I came up with this regex, which matches anything but captures only what I need.
const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
const regex = /(?:(\d+)|^|[^(]+)+?(?:\((?:(\d+)|\D*)\)|$)+?/;
list.forEach((a,i) => console.log(i+1+". ", a+"=>".padStart(11-a.length," "), JSON.stringify((a.match(regex)||[]).slice(1))));
CodePudding user response:
Here is an idea that needs no global flag and is meant to match only the required items:
^(?=\D*\d)(\d+)?\D*(?:\((\d*)\))?\D*$
^(?=\D*\d) : the lookahead at the ^ start checks for at least one digit
(\d+)? : captures the digits into the optional first group
\D* : followed by any amount of non-digits
(?:\((\d*)\))? : digits in parentheses go into the optional second group
\D*$ : matches any amount of \D non-digits up to the $ end
See your JS demo or a demo at regex101 (the [^\d\n] is only for the multiline demo)
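As a quick check, the pattern can be run against the list from the question. This is a minimal sketch; the variable name results is only for illustration.

```javascript
const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
// Single-match pattern: the lookahead rejects strings without any digit.
const regex = /^(?=\D*\d)(\d+)?\D*(?:\((\d*)\))?\D*$/;

// For each input keep the two capture groups, or null when the string is rejected.
const results = list.map((s) => {
  const m = s.match(regex);
  return m ? m.slice(1) : null;
});

list.forEach((s, i) => console.log(`${i + 1}. ${s} => ${JSON.stringify(results[i])}`));
// 1. 123(456) => ["123","456"]
// 2. 123 => ["123",null]
// 3. (456) => [null,"456"]
// 4. abc(456) => [null,"456"]
// 5. 123(def) => ["123",null]
// 6. abc(def) => null
// 7. abc => null
// 8. (def) => null
```

JSON.stringify prints an unmatched group (undefined) as null inside the array, while a bare null means the whole string did not match.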