I have a text area where the user can type some text, and afterwards I use a function to transform it to sentence case: the first letter of the text is capitalized, as is the first letter after each dot.
Example
input --> output
hello, one two. three four --> Hello, one two. Three four
But it works only with the basic letters a to z. I also need it to handle characters from other languages, such as the Danish Æ, Ø, Å.
Here is the function, which works great with letters from a to z:
function sentenceCase(input, lowercaseBefore) {
  // Treat undefined/null as an empty string
  input = (input === undefined || input === null) ? '' : input;
  if (lowercaseBefore) { input = input.toLowerCase(); }
  // Uppercase an a-z letter at the start of the string or after a dot and optional spaces
  return input.toString().replace(/(^|\. *)([a-z])/g, function (match, separator, char) {
    return separator + char.toUpperCase();
  });
}
I found that a class like [\p{L}\p{M}] can match those other characters as well. The pattern (^|\. *)([\p{L}\p{M}]) works in regex101, but when I put it in the code by replacing [a-z] with [\p{L}\p{M}], the function throws no errors and does not work either: the text is not transformed.
CodePudding user response:
To match any Unicode lowercase letter, you need to use the \p{Ll} Unicode category/property class together with the u flag:
/(^|\. *)(\p{Ll})/gu
Now, this will match multiple occurrences of:
(^|\. *) - Group 1: start of string, or a . followed by zero or more regular spaces
(\p{Ll}) - Group 2: any Unicode lowercase letter.
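Dropped into the function from the question, the pattern might look like the sketch below. It assumes an engine with ES2018 Unicode property escapes (current browsers and Node.js). The Danish sample string is only an illustration. The last two lines show why the original attempt failed silently: without the u flag, \p is read as an escaped literal "p", so the class is valid but matches only the characters p, {, L, l and }.

```javascript
// Sketch: sentence-case text in any script.
// Assumes ES2018+ support for \p{...} with the u flag.
function sentenceCase(input, lowercaseBefore) {
  // Treat undefined/null as an empty string, coerce everything else to a string
  input = (input === undefined || input === null) ? '' : String(input);
  if (lowercaseBefore) { input = input.toLowerCase(); }
  // Group 1: start of string, or a dot followed by zero or more spaces.
  // Group 2: one lowercase letter from any script.
  return input.replace(/(^|\. *)(\p{Ll})/gu, function (match, separator, char) {
    return separator + char.toUpperCase();
  });
}

console.log(sentenceCase('hello, one two. three four')); // Hello, one two. Three four
console.log(sentenceCase('æble. østers. år'));           // Æble. Østers. År

// Same class with and without the u flag
console.log(/[\p{Ll}]/.test('æ'));  // false: \p is a literal "p" here
console.log(/[\p{Ll}]/u.test('æ')); // true
```

Using \p{Ll} rather than \p{L} restricts the match to letters that are still lowercase, so letters that are already capitalized are left untouched.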