&zwnj; changed to &nbsp; by javascript or DOM?

Time:10-17

I am creating a web-based application that lets users prepare HTML emails. Part of the HTML email is the preheader. To display the preheader nicely we use &zwnj; and &nbsp; entities.

Could someone explain why the inspector shows the correct string but the string exported from the DOM is different? Could you suggest how to solve this?

https://i.imgur.com/mNqqan4.png

CodePudding user response:

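The character represented by &zwnj; is not replaced by &nbsp;. What you see happening is that this character is not written out as an HTML entity, but as the character itself (a zero-width non-joiner, U+200C, so it is not visible). When the DOM is serialized through innerHTML, only a handful of characters in text are escaped (&, <, > and the non-breaking space, which comes out as &nbsp;); everything else, including U+200C, is output as the raw character. The &nbsp; you see is just the character that follows that &zwnj;.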

It becomes easier to spot when you reduce the HTML and text to a bare minimum:

var html = document.querySelector("div").innerHTML;
console.log("full HTML content:", html);
console.log("length:", html.length); // 8
console.log("is first char an ampersand?:", html[0] === "&");
console.log("is last char a semi-colon?:", html[7] === ";");
console.log("character code of first character:", html.charCodeAt(0));
console.log("character code of last character:", html.charCodeAt(7));

var text = document.querySelector("div").textContent;
console.log("as plain text, length is:", text.length); // 3

console.log("encoded back:", html.replace(/\u200c/g, "&zwnj;"));
<div>&zwnj;&nbsp;&zwnj;</div>

Take note of the length that is reported: 8 (not 6 for just &nbsp;), and of the character code (8204, which is U+200C) of the characters that precede and follow &nbsp;. I hope this explains it.
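
As for how to solve it: one option is to re-encode the invisible characters yourself after reading innerHTML, which is what the last line of the snippet does for &zwnj;. Below is a minimal sketch of that idea. The helper name encodeInvisibleCharacters and the lookup table are hypothetical (they are not part of any library), and the set of characters covered is an assumption that you may want to extend for your own preheader.

```javascript
// Hypothetical lookup table: invisible characters that innerHTML outputs
// as raw characters, mapped to the entity we want in the exported string.
const INVISIBLE_ENTITIES = {
  "\u200b": "&#8203;", // zero-width space (numeric reference)
  "\u200c": "&zwnj;",  // zero-width non-joiner
  "\u200d": "&zwj;",   // zero-width joiner
};

// Hypothetical helper: replace each raw invisible character with its entity.
// The character class covers exactly U+200B, U+200C and U+200D.
function encodeInvisibleCharacters(html) {
  return html.replace(/[\u200b-\u200d]/g, (ch) => INVISIBLE_ENTITIES[ch]);
}

// This is the string innerHTML returns for <div>&zwnj;&nbsp;&zwnj;</div>
const exported = "\u200c&nbsp;\u200c";
console.log(encodeInvisibleCharacters(exported)); // &zwnj;&nbsp;&zwnj;
```

In the browser you would call it as encodeInvisibleCharacters(element.innerHTML) just before exporting the email. The non-breaking space needs no extra handling, since innerHTML already writes it as &nbsp;.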
