Why '❌'[0] === '❌' but '✔️'[0] !== '✔️'?

Time:10-13

'❌'[0] === '❌' // true
'✔️'[0] === '✔️' // false
'✔️'[0] === '✔'  // true

I suspect it's Unicode-related, but I would like to understand precisely what is happening and how I can correctly compare such characters. Why is '✔️' treated differently from '❌'?

I ran into this in a simple character count:

'✔️❌✔️❌'.split('').filter(e => e === '❌').length // 2
'✔️❌✔️❌'.split('').filter(e => e === '✔️').length // 0

CodePudding user response:

Because ✔️ takes two characters (UTF-16 code units): "✔️".length === 2

"✔️"[0] === "✔" and "✔️"[1] denotes color, I think.

And "❌".length === 1, so it takes only one character.

It's similar to the way emojis with different skin colors work as well.
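
To make the lengths above concrete, here is a minimal sketch that lists the UTF-16 code units of each string. The helper name codeUnits is made up for illustration, and the strings are written with \u escapes so the invisible second unit cannot get lost when copying.

```javascript
// Hypothetical helper (not from the original answer): list the UTF-16
// code units of a string as 4-digit uppercase hex.
function codeUnits(str) {
  return str
    .split('')
    .map(u => u.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

const check = '\u2714\uFE0F'; // ✔️ = heavy check mark + a second, invisible unit
const cross = '\u274C';       // ❌ = a single code unit

console.log(check.length, codeUnits(check)); // length 2, units 2714 and FE0F
console.log(cross.length, codeUnits(cross)); // length 1, unit 274C
console.log(check[0] === '\u2714');          // true: index 0 is the bare ✔
```

Indexing with [0] or calling split('') works on these code units one at a time, which is why the comparison in the question only succeeds against the bare '✔'.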

As for how to compare, I think "✔️".codePointAt(0) (not to be confused with charCodeAt()) might help. See https://thekevinscott.com/emojis-in-javascript/:

codePointAt and fromCodePoint are new methods introduced in ES2015 that can handle unicode characters whose UTF-16 encoding is greater than 16 bits, which includes emojis. Use these instead of charCodeAt, which doesn’t handle emoji correctly.
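
For the counting case in the question, one possible approach is to search for the whole two-unit sequence instead of comparing single elements. This is only a sketch: countOccurrences is a hypothetical helper, and it assumes non-overlapping matches are what is wanted. It also shows that codePointAt(0) on its own still returns just the base check mark here, so it cannot tell '✔️' from '✔'.

```javascript
const checkEmoji = '\u2714\uFE0F'; // ✔️
const crossEmoji = '\u274C';       // ❌
const text = checkEmoji + crossEmoji + checkEmoji + crossEmoji; // '✔️❌✔️❌'

// Hypothetical helper: count non-overlapping occurrences of `needle`.
function countOccurrences(haystack, needle) {
  if (needle === '') return 0; // avoid counting the gaps between code units
  return haystack.split(needle).length - 1;
}

console.log(countOccurrences(text, crossEmoji)); // 2
console.log(countOccurrences(text, checkEmoji)); // 2

// codePointAt(0) reads only the first code point, i.e. the bare check mark.
console.log(checkEmoji.codePointAt(0) === 0x2714); // true
```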

CodePudding user response:

I believe the '✔️' is made up of 2 components. When you output '✔️'[0] you get '✔', and the black checkmark does not equal the green checkmark.

However, the '❌' is made up of just a single component, so when you output '❌'[0], you get the same thing: '❌'.

CodePudding user response:

The second char, '✔️'[1] (code point 65039, i.e. U+FE0F), is a Variation Selector:

A Variation Selector specifies that the preceding character should be displayed with emoji presentation. Only required if the preceding character defaults to text presentation.

Often used in Emoji ZWJ Sequences, where one or more characters in the sequence have text and emoji presentation, but otherwise default to text (black and white) display.

Examples: Snowman as text: ☃. Snowman as Emoji: ☃️

Black Heart as text: ❤. Black Heart as Emoji: ❤️ (not so black)

Variation Selector-16 was approved as part of Unicode 3.2 in 2002.

https://unicode-table.com/en/FE0F/
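
If the text and emoji presentations should compare as equal, one option is to remove the variation selectors before comparing. This is a sketch, not an established recipe: stripVariationSelectors is a made-up name, and it assumes that only U+FE0E (text presentation) and U+FE0F (emoji presentation) need to be ignored.

```javascript
// Hypothetical helper: drop VS15 (U+FE0E) and VS16 (U+FE0F) from a string.
function stripVariationSelectors(str) {
  return str.replace(/[\uFE0E\uFE0F]/g, '');
}

const snowmanText = '\u2603';        // ☃ (text presentation)
const snowmanEmoji = '\u2603\uFE0F'; // ☃️ (emoji presentation)

console.log(snowmanEmoji.charCodeAt(1));    // 65039, the variation selector
console.log(snowmanText === snowmanEmoji);  // false
console.log(stripVariationSelectors(snowmanEmoji) === snowmanText); // true
```

Whether dropping the selector is acceptable depends on the use case, since it discards the distinction between the two presentations.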
