I am learning regular expressions.
Now I am trying to extract the street and the house number from a string.
This is my string: "Driesstraat(Ide) 20". Driesstraat is the street name, (Ide) is an acronym for the municipality, and 20 is the house number.
let re = /^(\d*[\p{L} \d'\/\\\-\.]+)[,\s]+(\d+)\s*([\p{L} \d\-\/'"\(\)]*)$/iu
let match = adresInput.value.match(re)
if (match) {
  match.shift();
  console.log(match.join('|'));
}
The above code works when there is no (Ide). I get this string from a Belgian eID reader.
Thank you in advance.
CodePudding user response:
First things first: the (...) part is not matched by your (\d*[\p{L} \d'\/\\\-\.]+) pattern, so you need to add (?:\(\p{L}*\))? or (?:\([^()]*\))? right after that pattern to optionally match a part of the string between parentheses. (?:\(\p{L}*\))? matches only letters between round brackets, while (?:\([^()]*\))? matches any chars other than ( and ) between round brackets.
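To illustrate the difference between the two optional groups, here is a minimal sketch. The second input, "Kerkstraat(St-P) 5", is a made-up example (not from the question) chosen only because its parenthesized part contains a non-letter.

```javascript
// Variant 1: only letters are allowed between the parentheses.
const lettersOnly = /^(\d*[\p{L} \d'/\\.-]+)(?:\(\p{L}*\))?[,\s]+(\d+)$/u;
// Variant 2: anything except ( and ) is allowed between the parentheses.
const anyChars = /^(\d*[\p{L} \d'/\\.-]+)(?:\([^()]*\))?[,\s]+(\d+)$/u;

// The string from the question: both variants match.
console.log(lettersOnly.test("Driesstraat(Ide) 20")); // true
console.log(anyChars.test("Driesstraat(Ide) 20"));    // true

// Hypothetical input with a hyphen inside the parentheses:
// only the [^()]* variant matches.
console.log(lettersOnly.test("Kerkstraat(St-P) 5")); // false
console.log(anyChars.test("Kerkstraat(St-P) 5"));    // true
```

If the municipality code can contain digits, hyphens or spaces, the [^()]* variant is probably the safer choice.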
Besides, when using regular expressions with the /u flag, the engine is stricter about escapes, so make sure you only escape what must be escaped inside character classes. You overescaped several chars inside [...]. You need not escape (, ) or ., and you can even avoid escaping - if it is put at the end of the character class. The chars that must be escaped inside character classes in the ECMAScript regex flavor are \ and ]; ^ must be escaped only if it is at the start of the character class. Although - can also be escaped, it is best practice to put it unescaped at the end of the character class.
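The stricter escape handling under /u can be checked with a small sketch. The helper name compiles is hypothetical, introduced here only for illustration; it simply reports whether a pattern source compiles with the given flags.

```javascript
// Hypothetical helper: true if the pattern source compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    return false; // a SyntaxError was thrown for an invalid pattern
  }
}

// \' is a needless escape: tolerated without /u, rejected with /u.
console.log(compiles("[\\']", ""));  // true
console.log(compiles("[\\']", "u")); // false

// Escaping ( ) . / - inside a class is still accepted with /u, just not needed.
console.log(compiles("[\\(\\)\\.\\/\\-]", "u")); // true

// The same class written without needless escapes, with - at the end.
const cls = /^[\p{L}()./'"-]+$/u;
console.log(cls.test("a-b(c)/d.e")); // true
```

Keeping escapes to a minimum avoids the first kind of error when a quote or another ordinary character gets escaped by habit.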
So, you can use
/^(\d*[\p{L} \d'/\\.-]+)(?:\([^()]*\))?[,\s]+(\d+)\s*([\p{L} \d/'"()-]*)$/u
Note that you do not even have to escape / inside character classes in the ECMAScript regex flavor in JavaScript.
See a JavaScript test:
let re = /^(\d*[\p{L} \d'/\\.\-]+)(?:\([^()]*\))?[,\s]+(\d+)\s*([\p{L} \d/'"()-]*)$/u;
console.log(re.test("Driesstraat(Ide) 20"));
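To get the street and house number rather than just a true/false result, the capture groups can be read from the match, as in the question's own code. This is a minimal sketch: the function name parseAddress and the returned property names are assumptions made for illustration, and the optional (...) part is deliberately left uncaptured.

```javascript
const addressRe = /^(\d*[\p{L} \d'/\\.-]+)(?:\([^()]*\))?[,\s]+(\d+)\s*([\p{L} \d/'"()-]*)$/u;

// Hypothetical helper: returns the captured parts, or null when the input does not match.
function parseAddress(input) {
  const match = input.match(addressRe);
  if (!match) return null;
  // match[0] is the whole match; groups 1..3 are street, number, and trailing suffix.
  const [, street, number, suffix] = match;
  return { street, number, suffix };
}

const withCode = parseAddress("Driesstraat(Ide) 20");
console.log([withCode.street, withCode.number, withCode.suffix].join("|")); // Driesstraat|20|

const withoutCode = parseAddress("Driesstraat 20");
console.log([withoutCode.street, withoutCode.number, withoutCode.suffix].join("|")); // Driesstraat|20|
```

The trailing | appears because the third group matches an empty string when nothing follows the house number.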