I need to match SSNs only when both delimiters are the same character, but the regex below matches all of them.
((?:\d[-.\s]*?){9})
Input:
list of ssn are 222-33-4444, 333.77-8888 and 111 77.9998 and 111 22 3333 and 11-222222-9
Expected output:
222-33-4444
111 22 3333
11-222222-9
CodePudding user response:
You can capture the first delimiter and then use a back-reference to assert that the second delimiter is the same character. Since the position of the delimiters can vary, you also need a lookahead to assert that the candidate is 11 characters long, i.e. 9 digits plus 2 delimiters:
\b(?=[\d. -]{11}\b)\d{1,}([. -])\d{1,}\1\d{1,}\b
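As a rough check, the pattern above can be run against the input from the question. This is only a sketch in Python (the question does not say which regex engine is in use), and the names SSN_SAME_DELIM and find_ssns are made up for illustration:

```python
import re

# Pattern copied from the answer: a lookahead for 11 digit/delimiter characters,
# and a back-reference (\1) forcing the second delimiter to equal the first.
SSN_SAME_DELIM = re.compile(r"\b(?=[\d. -]{11}\b)\d{1,}([. -])\d{1,}\1\d{1,}\b")

def find_ssns(text):
    # Hypothetical helper. finditer + group(0) yields the whole match;
    # re.findall would return only the captured delimiter, because the
    # pattern contains one capturing group.
    return [m.group(0) for m in SSN_SAME_DELIM.finditer(text)]

text = ("list of ssn are 222-33-4444, 333.77-8888 and 111 77.9998 "
        "and 111 22 3333 and 11-222222-9")
print(find_ssns(text))
# ['222-33-4444', '111 22 3333', '11-222222-9']
```

The two mixed-delimiter candidates (333.77-8888 and 111 77.9998) are rejected because \1 must repeat whatever character the first delimiter captured.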
If the SSN may be adjacent to word characters, \b will not work (there is no word boundary between a digit and another word character such as a letter or underscore), and you will need to use negative lookarounds to assert that the SSN is not preceded or followed by other digits:
(?<!\d)(?=[\d. -]{11}(?!\d))\d{1,}([. -])\d{1,}\1\d{1,}(?!\d)
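To see the difference between the two patterns, here is a small Python sketch that runs both on text where the numbers touch letters. The sample string and the variable names are invented for illustration and are not from the question:

```python
import re

# Both patterns are copied from the answer; only the names are hypothetical.
SSN_WORD_BOUNDARY = re.compile(
    r"\b(?=[\d. -]{11}\b)\d{1,}([. -])\d{1,}\1\d{1,}\b")
SSN_LOOKAROUND = re.compile(
    r"(?<!\d)(?=[\d. -]{11}(?!\d))\d{1,}([. -])\d{1,}\1\d{1,}(?!\d)")

# Made-up sample: each candidate is glued to letters on both sides.
sample = "id222-33-4444x and ref333.77-8888y"

boundary_hits = [m.group(0) for m in SSN_WORD_BOUNDARY.finditer(sample)]
lookaround_hits = [m.group(0) for m in SSN_LOOKAROUND.finditer(sample)]

print(boundary_hits)    # []
print(lookaround_hits)  # ['222-33-4444']
```

The \b version finds nothing because "d2" and "4x" contain no word boundary, while the lookaround version only requires that the neighbouring characters are not digits. The mixed-delimiter candidate is still rejected by the back-reference.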