Here is an example sentence:
क्या आप क्लोज़अप करते हैं
I want to extract the first word क्या
from this sentence using a regex. For English text I can do so with (^\w+)
but that doesn't work with other alphabets.
How should I proceed?
CodePudding user response:
You need to add the u flag for Unicode support:
const str = 'क्या आप क्लोज़अप करते हैं ';
console.log('Letters and combining marks: ' + str.match(/^[\p{L}\p{M}]+/u));
console.log('Anything but space: ' + str.match(/^[^\p{Zs}]+/u));
Result:
Letters and combining marks: क्या
Anything but space: क्या
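Explanation:
- both regexes use ^ to anchor at the beginning of the string
- regex 1: [\p{L}\p{M}]+ matches one or more letters and combining marks (\p{M} is needed because Devanagari vowel signs and the virama are marks, not letters)
- regex 2: [^\p{Zs}]+ matches one or more characters that are not a space separator (\p{Zs} covers Unicode space separators such as the no-break space, but not tabs or newlines)
- the u flag enables Unicode mode so that you can use \p{...} Unicode property escapes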
See details at https://javascript.info/regexp-unicode
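As a minimal sketch, the first regex can be wrapped in a small helper. The name firstWord is hypothetical and not part of the answer above; it only shows how to handle the case where nothing matches, and contrasts the result with the ASCII-only \w from the question.

```javascript
// Hypothetical helper (name is an assumption): returns the leading run of
// letters and combining marks, or null when the string does not start with one.
function firstWord(text) {
  const m = text.match(/^[\p{L}\p{M}]+/u);
  return m ? m[0] : null;
}

console.log(firstWord('क्या आप क्लोज़अप करते हैं')); // क्या
console.log(firstWord('hello world'));            // hello
console.log(firstWord('  leading space'));        // null

// For comparison: \w only covers [A-Za-z0-9_] in JavaScript,
// so the pattern from the question finds nothing here.
console.log(/^\w+/.exec('क्या आप'));               // null
```

Returning null instead of throwing keeps the helper safe for strings that start with whitespace or punctuation; trim the input first if leading spaces should be ignored.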
CodePudding user response:
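You can use this regex to extract the first word:
^[\pL\pM]+
The \pL shorthand is PCRE-style syntax; JavaScript needs the braced form \p{L} together with the u flag. \pM is included because \pL on its own stops at the first combining mark.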
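A quick check, written in JavaScript's braced syntax, of why a letters-only class is not enough for Devanagari. This is only an illustrative sketch; the variable names are made up for the example.

```javascript
const word = 'क्या'; // four code points: क + virama + य + vowel sign aa

// Letters only: stops at the virama, which is a combining mark (category Mn).
const lettersOnly = word.match(/^\p{L}+/u)[0];

// Letters plus combining marks: consumes the whole word.
const lettersAndMarks = word.match(/^[\p{L}\p{M}]+/u)[0];

console.log(lettersOnly);                                // क
console.log(lettersAndMarks);                            // क्या
console.log(lettersOnly.length, lettersAndMarks.length); // 1 4
```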
CodePudding user response:
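Try the following regex:
[^\x00-\x7F]+
It matches the first run of non-ASCII characters, which works here because the space that ends the word is ASCII.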
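A small sketch of how the non-ASCII approach behaves, including two cases where it differs from "first word". The helper name firstNonAsciiRun is hypothetical.

```javascript
// Hypothetical helper: returns the first run of non-ASCII characters, or null.
function firstNonAsciiRun(text) {
  const m = text.match(/[^\x00-\x7F]+/);
  return m ? m[0] : null;
}

console.log(firstNonAsciiRun('क्या आप'));    // क्या
// The pattern is not anchored, so leading ASCII words are skipped:
console.log(firstNonAsciiRun('Hello क्या')); // क्या
// Non-ASCII punctuation such as the danda (U+0964) is included in the run:
console.log(firstNonAsciiRun('क्या। आप'));   // क्या।
```

This is fine for input known to be purely Devanagari words separated by ASCII spaces; for mixed text the \p{L}\p{M} class is the more precise choice.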