I have a regex that looks like this:
/(((\ |00)32[ ]?(?:\(0\)[ ]?)?)|0){1}(4(60|[789]\d)\/?(\s?\d{2}\.?){2}(\s?\d{2})|(\d\/?\s?\d{3}|\d{2}\/?\s?\d{2})(\.?\s?\d{2}){2})/g
It matches 32 16/894477, but it doesn't match 32 16-894477.
It also matches 20150211-0001731015-1, which it shouldn't.
I am trying to fix my regex here:
https://regex101.com/r/LmaIPA/1
CodePudding user response:
(((\ |00)32[ ]?(?:\(0\)[ ]?)?)|0){1}(4(60|[789]\d)\/?(\s?\d{2}\.?){2}(\s?\d{2})|(\d\/?\s?\d{3}|\d{2}(\/?|\-)\s?\d{2})(\.?\s?\d{2}){2})
I think I fixed part of it by replacing \/? with (\/?|\-) in the second alternative, so the hyphen is accepted. Let me know if there is something else that doesn't work properly :)
CodePudding user response:
There are a lot of capture groups, and some can be omitted if you don't need them for further processing.
The issue with 32 16-894477 is that the pattern does not match the hyphen. The longer string 20150211-0001731015-1 matches because there are no boundaries set, so you get a partial match inside it.
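To see both problems in isolation, here is a small sketch that runs the pattern from the question against the two problem strings. The variable names are my own, and the only change to the pattern is dropping the g flag, which PHP does not use:

```php
<?php
// Pattern copied from the question; only the trailing "g" flag is dropped,
// because preg_match_all() already returns every match.
$original = '/(((\ |00)32[ ]?(?:\(0\)[ ]?)?)|0){1}(4(60|[789]\d)\/?(\s?\d{2}\.?){2}(\s?\d{2})|(\d\/?\s?\d{3}|\d{2}\/?\s?\d{2})(\.?\s?\d{2}){2})/';

// 1) The hyphen is not part of the pattern, so nothing matches here.
$hyphenCount = preg_match_all($original, '32 16-894477', $hyphenMatches);

// 2) No boundaries: the regex is free to match inside the longer string.
$logCount = preg_match_all($original, '20150211-0001731015-1', $logMatches);

var_dump($hyphenCount);   // int(0)
print_r($logMatches[0]);  // a single partial match: 000173101
```

The second result is the partial match: a leading 0 followed by eight digits taken from the middle of the longer string.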
Some notes:
- You don't have to escape the / when using a different delimiter
- You can omit {1} from the pattern
- \s can also match a newline, you can use \h if you want to match a horizontal whitespace char
- A single space [ ] does not have to be in a character class
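A quick way to check the notes about \s versus \h and about the delimiter is a sketch like the following. The test strings are made up for illustration and are not from the question:

```php
<?php
// Hypothetical input with a newline between the two digit groups.
$acrossLines = "016\n894477";

// \s accepts the newline, \h (horizontal whitespace only) does not.
$withS = preg_match('~^\d{3}\s\d{6}$~', $acrossLines);
$withH = preg_match('~^\d{3}\h\d{6}$~', $acrossLines);

// With "~" as the delimiter the forward slash needs no escaping;
// with "/" as the delimiter it has to be written as \/.
$tildeDelimiter = preg_match('~^016/89$~', '016/89');
$slashDelimiter = preg_match('/^016\/89$/', '016/89');

var_dump($withS, $withH);                   // int(1) int(0)
var_dump($tildeDelimiter, $slashDelimiter); // int(1) int(1)
```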
You can extend the pattern by adding the hyphen and forward slash to a character class using [/-]?, wrap the whole pattern in a non-capturing group, and assert a whitespace boundary to the right: (?:whole pattern here)(?!\S)
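To show what the [/-]? class and the (?!\S) assertion each do, here is a minimal sketch with a deliberately simplified stand-in pattern (two digits, an optional separator, six digits). It is not the full phone pattern, and the third test string is invented for the example:

```php
<?php
// Simplified stand-in (NOT the full phone pattern):
// 2 digits, an optional "/" or "-", then 6 digits.
$loose  = '~\d{2}[/-]?\d{6}~';
// Same pattern in a non-capturing group, followed by a whitespace boundary.
$strict = '~(?:\d{2}[/-]?\d{6})(?!\S)~';

$subjects = ['16/894477', '16-894477', '16-8944771-1'];

$results = [];
foreach ($subjects as $s) {
    $results[$s] = [
        'loose'  => preg_match($loose, $s),
        'strict' => preg_match($strict, $s),
    ];
}
print_r($results);
```

Both separators are accepted by either pattern, but only the loose one matches inside 16-8944771-1. In the strict version the match must be followed by whitespace or the end of the string, so the trailing 1-1 rejects it.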
A version without the capture groups for a match only:
(?:(?:(?:\ |00)32\h?(?:\(0\)\h?)?|0)(?:4(?:60|[789]\d)/?(?:\h?\d{2}\.?){2}\h?\d{2}|(?:\d/?\h?\d{3}|\d{2}[/-]?\h?\d{2})(?:\.?\h?\d{2}){2}))(?!\S)
PHP example
$re = '~(?:(?:(?:\ |00)32\h?(?:\(0\)\h?)?|0)(?:4(?:60|[789]\d)/?(?:\h?\d{2}\.?){2}\h?\d{2}|(?:\d/?\h?\d{3}|\d{2}[/-]?\h?\d{2})(?:\.?\h?\d{2}){2}))(?!\S)~';
$str = 'OK 01/07 - 31/07
OK 0487207339
OK 32487207339
OK 01.07.2016
OK 32 (0)16 89 44 77
OK 016894477
OK 003216894477
OK 3216894477
OK 016/89.44.77
OK 32 16894477
OK 0032 16894477
OK 32 16/894477
NOK 32 16-894477 (this should match)
OK 0479/878810
NOK 20150211-0001731015-1 (this shouldn\'t match)';
preg_match_all($re, $str, $matches);
print_r($matches[0]);
Output
Array
(
[0] => 0487207339
[1] => 32487207339
[2] => 32 (0)16 89 44 77
[3] => 016894477
[4] => 003216894477
[5] => 3216894477
[6] => 016/89.44.77
[7] => 32 16894477
[8] => 0032 16894477
[9] => 32 16/894477
[10] => 32 16-894477
[11] => 0479/878810
)