I am currently using this regex to isolate 5-digit values that are not preceded or followed by a digit, a dash, or a period. In addition, I am trying to find a way to exclude lines that contain WO or PO, in any letter case. I have tried different variations of where to put WO and PO as conditions to check against, but fail at every turn.
(?<![-0-9.])[0-9]{5}(?![-0-9.])
Current output - Not desired
asdkjflsdf 12345 good
asdfsdf 1234asdfsdf bad
12345 good
.12345. bad
-12345 bad
SO 12345 good
123456 ppp bad
1234 bad
PO12345 good <--
Wo 12345 good <--
Output - Desired
asdkjflsdf 12345 good
asdfsdf 1234asdfsdf bad
12345 good
.12345. bad
-12345 bad
SO 12345 good
123456 ppp bad
1234 bad
PO12345 bad <--
Wo 12345 bad <--
Any help would be greatly appreciated. Thank you
CodePudding user response:
A two-pass approach, as ti7 suggests, may indeed offer the simplest solution:
'asdkjflsdf 12345',
'asdfsdf 1234asdfsdf',
'12345',
'.12345.',
'-12345',
'SO 12345',
'123456 ppp',
'1234',
'PO12345',
'Wo 12345',
'CompanName WO# 12345' |
  ForEach-Object {
    [pscustomobject] @{
      Input  = $_
      Result = $_ -match '(?<![-0-9.])[0-9]{5}(?![-0-9.])' -and $_ -notmatch '[wp]o'
    }
  }
Output:
Input                 Result
-----                 ------
asdkjflsdf 12345      True
asdfsdf 1234asdfsdf   False
12345                 True
.12345.               False
-12345                False
SO 12345              True
123456 ppp            False
1234                  False
PO12345               False
Wo 12345              False
CompanName WO# 12345  False
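One thing to keep in mind with the second pass: -notmatch '[wp]o' rejects any line that contains "wo" or "po" anywhere, including inside ordinary words such as "Two" or "Depot". The following is only a sketch of a tighter second pass, assuming that WO/PO should count only when it is not part of a longer word. The $woOrPo pattern and the extra sample inputs are hypothetical and not part of the original question.

```powershell
# First pass: the 5-digit pattern from the question.
$fiveDigits = '(?<![-0-9.])[0-9]{5}(?![-0-9.])'

# Hypothetical stricter second pass (an assumption, not from the question):
# WO/PO must not have a letter directly before or after it.
$woOrPo = '(?<![a-z])[wp]o(?![a-z])'

$results = 'Two 12345', 'Depot 12345', 'PO12345', 'Wo 12345', 'CompanName WO# 12345' |
  ForEach-Object {
    [pscustomobject] @{
      Input  = $_
      # Loose: the second pass used above, rejects "wo"/"po" anywhere in the line.
      Loose  = $_ -match $fiveDigits -and $_ -notmatch '[wp]o'
      # Strict: rejects WO/PO only when it is not embedded in a longer word.
      Strict = $_ -match $fiveDigits -and $_ -notmatch $woOrPo
    }
  }

$results
```

With these inputs, Loose is False for every line, while Strict is True only for 'Two 12345' and 'Depot 12345'. Whether that stricter behaviour is wanted depends on what the real data looks like.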
CodePudding user response:
You can probably use a much simpler regex and then run a second round that excludes the undesirable matches.
Round 1 (exactly 5 digits delimited by word boundaries)
^.*\b\d{5}\b.*$
Round 2 (keep only lines that contain no WO or PO, in any case)
^(?!.*[WwPp][Oo])
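A minimal sketch of the two rounds in PowerShell, using a subset of the sample lines from the question. Round 2 is written with -notmatch instead of a lookahead, which is a choice made for this sketch only, and the variable names are illustrative. Note that \b treats "." and "-" as word boundaries, so '.12345.' and '-12345' pass round 1; if those must be excluded, the lookarounds from the question would need to replace \b.

```powershell
$lines = 'asdkjflsdf 12345', '.12345.', '-12345', 'SO 12345',
         '123456 ppp', 'PO12345', 'Wo 12345'

# Round 1: keep lines containing a run of exactly five digits
# delimited by word boundaries.
$round1 = @($lines | Where-Object { $_ -match '\b\d{5}\b' })

# Round 2: drop lines that contain WO or PO
# (-notmatch is case-insensitive by default).
$round2 = @($round1 | Where-Object { $_ -notmatch '[wp]o' })

$round2
```

'PO12345' is already dropped in round 1, because there is no word boundary between "O" and "1"; 'Wo 12345' is dropped in round 2.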
CodePudding user response:
In PowerShell, you can use
$s -match '(?<![-0-9.])(?<![pw]o[\W_]*)[0-9]{5}(?![-0-9.])'
The pattern uses two negative lookbehinds, the second of which is new:
(?<![-0-9.]) - immediately to the left, there should be no ASCII digit, - or . chars
(?<![pw]o[\W_]*) - immediately before the current location, there should be no PO or WO (case insensitive) substrings (as -match matches in a case insensitive way) followed by zero or more non-alphanumeric chars
[0-9]{5} - five ASCII digits
(?![-0-9.]) - immediately to the right, there should be no -, . or an ASCII digit.
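To check the single-regex version against the sample lines from the question, something along these lines could be used. This is a sketch; the $pattern, $samples and $results variable names are only illustrative.

```powershell
$pattern = '(?<![-0-9.])(?<![pw]o[\W_]*)[0-9]{5}(?![-0-9.])'

$samples = 'asdkjflsdf 12345', 'asdfsdf 1234asdfsdf', '12345', '.12345.',
           '-12345', 'SO 12345', '123456 ppp', '1234', 'PO12345',
           'Wo 12345', 'CompanName WO# 12345'

# -match is case-insensitive, so [pw]o also covers PO, Po, WO, Wo, etc.
$results = $samples | ForEach-Object {
  [pscustomobject] @{
    Input  = $_
    Result = $_ -match $pattern
  }
}

$results
```

Only 'asdkjflsdf 12345', '12345' and 'SO 12345' should come back as True. If the pattern is used through [regex]::IsMatch() rather than -match, matching is case-sensitive by default, so an inline (?i) or RegexOptions.IgnoreCase would be needed there.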