Sentence case function does not recognize all characters

Time:12-17

I have a text area where the user can type some text, and afterwards I transform it to sentence case with a function. The first letter of the text is capitalized, as is the first letter after each dot.

Example

input --> output

hello, one two. three four --> Hello, one two. Three four

But it works only with the regular letters a to z. I also need it to handle characters from other languages, such as the Danish Æ, Ø, Å.

Here is the function, which works great with letters from a to z:

    function sentenceCase(input, lowercaseBefore) {
        input = (input === undefined || input === null) ? '' : input;
        if (lowercaseBefore) { input = input.toLowerCase(); }
        return input.toString().replace(/(^|\. *)([a-z])/g, function (match, separator, char) {
            return separator + char.toUpperCase();
        });
    }

I found that a class like [\p{L}\p{M}] can match other characters as well. The pattern (^|\. *)([\p{L}\p{M}]) works in regex101, but when I put it in the code, the function shows no errors and does not work either: the text is not transformed.

What I tried: replacing [a-z] with [\p{L}\p{M}].

Answer:

To match any Unicode lowercase letter, you need to use the \p{Ll} Unicode category (property) class together with the u flag:

/(^|\. *)(\p{Ll})/gu

With the g flag, this matches every occurrence of the following sequence:

  • (^|\. *) - Group 1: start of string, or a . followed by zero or more regular spaces
  • (\p{Ll}) - Group 2: any Unicode lowercase letter.
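
For completeness, here is a minimal sketch of the question's sentenceCase function with that pattern dropped in. It assumes a JavaScript engine that supports Unicode property escapes (ES2018 or later) and input that uses precomposed characters such as æ, ø, å. The sample strings are only illustrations.

```javascript
// Sketch: the function from the question, with [a-z] replaced by \p{Ll}.
function sentenceCase(input, lowercaseBefore) {
    // Treat undefined/null as an empty string; coerce anything else to a string.
    input = (input === undefined || input === null) ? '' : String(input);
    if (lowercaseBefore) { input = input.toLowerCase(); }
    // The u flag is what makes \p{Ll} a Unicode property escape;
    // without it, the engine does not read \p{...} as a letter class.
    return input.replace(/(^|\. *)(\p{Ll})/gu, function (match, separator, char) {
        return separator + char.toUpperCase();
    });
}

console.log(sentenceCase('hello, one two. three four')); // Hello, one two. Three four
console.log(sentenceCase('æble. øl. år'));               // Æble. Øl. År
```

This also explains why the original attempt failed silently: without the u flag, [\p{L}\p{M}] is still a valid pattern, so no error is thrown, but it is not interpreted as "any letter or mark".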