Split UTF16 sentence in JavaScript

Time:10-30

I have two sentences separated by a new line.

I assume the regex for it is: /\r?\n/

But the sentence is in UTF16 encoding and the sentence is coming from an external program (I cannot change it).

For simplicity, I will put the sentences in a variable here, separated by \n, but please correct me if that is not what a new line looks like in UTF16.

A good reference is https://codepoints.net/U+000A, the line feed in Unicode.

I was trying to split it into an array of parts with the following code:

const body = '뼨莯Ƕ䥆⤄奼ꅺ쏁ష\nǷƙ析輽䤷_瀧繡媓'
const parts = body.split(/\r?\n/)

As a result, I am getting an empty array.

How would you split the UTF16 sentences divided by a new line delimiter in JavaScript?

CodePudding user response:

It's not necessary to use a regex; you can simply split on the new line character.

The \r is part of the Windows line ending (\r\n), but since both \n and \r\n contain a \n, splitting on \n works for either (with \r\n input, each part except the last keeps a trailing \r).
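A quick way to check how \r\n input behaves is to compare both ways of splitting. This is only a sketch: the variable names are made up for illustration, and the input is the sample from the question with the separator changed to \r\n.

```javascript
// Sample text from the question, but with a Windows-style \r\n separator.
const crlfBody = '뼨莯Ƕ䥆⤄奼ꅺ쏁ష\r\nǷƙ析輽䤷_瀧繡媓'

// Splitting on "\n" alone leaves the "\r" at the end of the first part.
const byNewline = crlfBody.split('\n')
console.log(byNewline[0].endsWith('\r')) // true

// The regex from the question consumes the optional "\r" as well.
const byRegex = crlfBody.split(/\r?\n/)
console.log(byRegex[0].endsWith('\r')) // false
console.log(byRegex.length) // 2
```

If the external program might send \r\n, either keep the regex or trim each part after splitting.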

const body = '뼨莯Ƕ䥆⤄奼ꅺ쏁ష\nǷƙ析輽䤷_瀧繡媓'
const parts = body.split("\n")

console.log(parts)
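JavaScript strings are already sequences of UTF-16 code units, so a string literal like the one above needs no special handling. If the external program delivers raw bytes instead of a string, they may need decoding first. The sketch below assumes little-endian UTF-16 bytes (an assumption, since the question does not say how the data arrives) and uses the standard TextDecoder; the byte values and variable names are made up for illustration.

```javascript
// Assumption: the external program hands over raw UTF-16LE bytes.
// These bytes spell "a\nb": U+0061, U+000A, U+0062, low byte first.
const rawBytes = new Uint8Array([0x61, 0x00, 0x0a, 0x00, 0x62, 0x00])

// Decode to a JavaScript string first, then split as usual.
const decoded = new TextDecoder('utf-16le').decode(rawBytes)
const decodedParts = decoded.split(/\r?\n/)

console.log(decodedParts.length) // 2
```

Splitting the undecoded data, or decoding it with the wrong encoding, could be one reason a split does not return the expected parts.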
