I'm trying to write a regex that matches Arabic and English letters only. Numbers and special characters are not allowed; spaces are allowed. This regex mostly works, but it allows numbers in the middle of the string:
/[\u0620-\u064A\040a-zA-Z]+$/
For example, it matches (سم111111ر), which it is not supposed to match. Is there a way to avoid matching numbers in the middle of the letters?
CodePudding user response:
Note that in JavaScript this requires ECMAScript 2018 or later, which added Unicode property escapes (\p{...}); they only work with the u flag:
const texts = ['أسبوع أسبوع', 'week week', 'hunāka', 'سم111111ر'];
// The u flag enables \p{...}; the + repeats the group one or more times
const re = /^(?:(?=[\p{Script=Arabic}A-Za-z])\p{L}|\s)+$/u;
for (const text of texts) {
  console.log(text, '=>', re.test(text));
}
The pattern ^(?:(?=[\p{Script=Arabic}A-Za-z])\p{L}|\s)+$ means:
^ - start of string
(?: - start of a non-capturing group:
(?=[\p{Script=Arabic}A-Za-z]) - a positive lookahead that requires a char from the Arabic script or an ASCII letter to occur immediately to the right of the current location
\p{L} - any Unicode letter (note that \p{Alphabetic} includes a few more "letter" chars, you may want to try it out)
| - or
\s - whitespace
)+ - end of the group, repeated one or more times
$ - end of string.
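To make the role of the anchors concrete, here is a minimal sketch that compares a pattern anchored only at the end (as in the question) with the fully anchored one. The helper name isArabicOrEnglishText and the constant names are hypothetical, chosen only for illustration; the patterns themselves are the ones discussed above.

```javascript
// Fully anchored pattern from the answer: every char must be a letter that is
// either Arabic-script or ASCII, or else whitespace.
const LETTERS_AND_SPACES = /^(?:(?=[\p{Script=Arabic}A-Za-z])\p{L}|\s)+$/u;

// Hypothetical helper name, used only for this sketch.
function isArabicOrEnglishText(text) {
  return LETTERS_AND_SPACES.test(text);
}

// Character class from the question, anchored at the end only: it succeeds as
// soon as the string ends with at least one allowed char.
const END_ANCHORED_ONLY = /[\u0620-\u064A\040a-zA-Z]+$/;

// The final letter alone satisfies the end-anchored pattern, digits or not.
console.log(END_ANCHORED_ONLY.test('سم111111ر'));     // true
// With ^ and $ in place, the digits in the middle make the match fail.
console.log(isArabicOrEnglishText('سم111111ر'));      // false
console.log(isArabicOrEnglishText('week أسبوع'));     // true
// The + quantifier requires at least one char, so the empty string fails.
console.log(isArabicOrEnglishText(''));               // false
```

Note that \s also matches tabs and line breaks, and a whitespace-only string passes this check. If that is not wanted, trimming the input and rejecting empty results before testing is one option.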