How do I match a specific pattern and exclude specific sub-strings with regex

Time:08-26

How do I match a specific pattern and exclude specific sub-strings with regex case insensitive

I am trying to write a regex for New Zealand address validation. The valid character set, matched case-insensitively, includes the letters A to Z, digits, hyphen "-" and forward slash "/", as well as the macronised Māori vowels ā, ē, ī, ō, ū. This pattern works for the character set:

....

var regex = /^[\/A-Za-zĀ-ū0-9\s,'\-]*$/;

....

To be valid, the string must also not contain either of the following sub-strings, matched case-insensitively and with or without spaces:

PO Box

Private Bag

(This is tricky because both sub-strings may or may not include spaces, and may be upper or lower case, depending on how the user types them.)

The string must also start with a number to be valid.

e.g.

This is invalid:

Flat 1 311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand

This is valid:

1/311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand

311/1 or 311-1 or 1-311 are all considered valid by NZ Postal Service.

For example, given the regex above, the address string is invalid whenever the condition in this if statement is true:

// Allowed character set
var regex = /^[\/A-Za-zĀ-ū0-9\s,'\-]*$/;

// Get the address string and convert it to lowercase
// (getValue() is pseudocode for reading the input)
var str = getValue().toLowerCase();

// Strip whitespace
str = str.replace(/\s/g, '');

// Invalid if the string contains "pobox" or "privatebag", doesn't start
// with a digit, or contains characters outside the allowed set
if (str.includes("pobox") || str.includes("privatebag") || !/^\d/.test(str) || !regex.test(str)) {
    // address is invalid
}

....

Thanks, I really appreciate the community's input, and I know there are regex gurus out there. I am trying to simplify this so that I can use HTML5 form validation rather than a clunky JavaScript evaluation.

Answer:

One option is to first match the expressions you want to exclude and, only if that alternative fails, match a string that starts with a digit and contains nothing but the allowed character set.

/^.*(po\s*box|private\s*bag).*$|^\d[\/a-zĀ-ū0-9\s\,\'\-]*$/i

By capturing the excluded patterns, you can check whether group 1 has a value. If it does, the string contains an excluded phrase and should be rejected.

var regex = /^.*(po\s*box|private\s*bag).*$|^\d[\/a-zĀ-ū0-9\s\,\'\-]*$/i;
var match = str.match(regex);

if (match && !match[1]) {
   // valid address
} else {
   // invalid address
}
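
To see how the group 1 check behaves on concrete inputs, the snippet above could be wrapped in a small function. This is only a sketch: the regex is the one from this answer, while the name isValidAddress and the PO Box, Private Bag and Māori sample strings are made up for illustration (the two Point Chevalier addresses come from the question).

```javascript
// Same pattern as above: the first alternative captures an excluded phrase
// into group 1; the second accepts a digit-led string of allowed characters.
var addressRegex = /^.*(po\s*box|private\s*bag).*$|^\d[\/a-zĀ-ū0-9\s\,\'\-]*$/i;

// isValidAddress is a hypothetical helper name, not part of the original answer.
function isValidAddress(str) {
  var match = str.match(addressRegex);
  // Valid only when something matched and the exclusion group stayed empty.
  return Boolean(match && !match[1]);
}

var samples = [
  "1/311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand", // from the question
  "Flat 1 311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand", // from the question
  "12 PO Box 45, Auckland",    // illustrative: excluded phrase with a space
  "12 PrivateBag 3, Auckland", // illustrative: excluded phrase without a space
  "5 Māori Road"               // illustrative: macronised vowel
];

samples.forEach(function (s) {
  console.log(isValidAddress(s), s);
});
```

The first and last samples should come out as valid; the second fails because it does not start with a digit, and the third and fourth fail because group 1 is populated.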

See https://regex101.com/r/AiL6Dy/2
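
If a single pass/fail pattern is preferred over inspecting a capture group, a negative lookahead is one possible alternative. The sketch below is not the pattern from the answer above; singlePattern and passesSinglePattern are hypothetical names, and the lookahead uses [\s\S]* rather than .* on the assumption that input containing line breaks should also be checked for the excluded phrases.

```javascript
// Hypothetical single-pattern variant (an assumption, not the answer's regex).
// (?![\s\S]*(?:...)) fails the match if an excluded phrase appears anywhere;
// \d forces a leading digit; the class lists the allowed characters.
var singlePattern = /^(?![\s\S]*(?:po\s*box|private\s*bag))\d[\/a-zĀ-ū0-9\s,'\-]*$/i;

// Returns true when the whole string is an acceptable address.
function passesSinglePattern(str) {
  return singlePattern.test(str);
}

console.log(passesSinglePattern("1/311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand"));
console.log(passesSinglePattern("12 po box 45, Auckland"));
console.log(passesSinglePattern("Flat 1 311 Point Chevalier Road"));
```

Note that this version still relies on the i flag. The HTML pattern attribute has no way to set that flag, so a pattern intended for HTML5 form validation would need the case variants spelled out, for example [Pp][Oo]\s*[Bb][Oo][Xx].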
