How to unescape escaped characters from the regex (.*?)

Time:07-31

I have a markup string, for example:

 var text = '<div>\frac{5}{6}</div>'

I want to get the text between the div tags with this:

var inBetween = text.replace(/<div>(.*?)<\/div>/g,'$1');
console.log(inBetween);

But this outputs rac{5}{6}, because the backslash has escaped the letter "f". How can I undo this?

EDIT:

According to the comments, \f is a form feed character, and it is preserved in the output.

CodePudding user response:

JavaScript converts escape sequences in string literals into the special characters they represent, so the literal \ is lost. If you need to preserve it, either escape the backslash itself as \\ or convert the special characters back into their escaped form:

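// Map each control character back to its two-character escape sequence.
const dict = {"\b": "\\b", "\f": "\\f", "\n": "\\n", "\r": "\\r", "\t": "\\t", "\v": "\\v"};
// Inside a character class, \b means backspace, not a word boundary.
const unchar = text => text.replace(/[\b\f\n\r\t\v]/g, c => dict[c]);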

var text = `<div>\frac{5}{6}</div>`;
var inBetween = text.replace(/<div>(.*?)<\/div>/g,'$1');

console.log(text);
console.log(inBetween);
console.log(unchar(text));
console.log(unchar(inBetween));
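
For the first option (keeping the backslash in the literal itself), String.raw may be a convenient alternative to doubling every backslash by hand. This is only a sketch, and the variable names are made up for illustration:

```javascript
// String.raw leaves backslashes in a template literal untouched,
// so \f stays as two characters: a backslash and the letter f.
const rawText = String.raw`<div>\frac{5}{6}</div>`;
const rawInBetween = rawText.replace(/<div>(.*?)<\/div>/g, '$1');

console.log(rawInBetween); // \frac{5}{6}
```

Note that this only helps when the string is written as a literal in your source code. A string that already contains a form feed at runtime still needs something like unchar.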

CodePudding user response:

Don't use regular expressions to parse HTML; use an HTML parser. Your browser already has one built in for you to use:

let code = `<div>\\frac{5}{6}</div>`;
let doc = new DOMParser().parseFromString(code, `text/html`)
let content = doc.querySelector(`div`).textContent

But of course, note that your string is missing a \:

  • "\\f" in a string declaration is a backslash, and then the letter f
  • "\f" in a string declaration is the FORM FEED control code (\u000c)
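
A quick way to see the difference between the two is to compare their lengths and character codes (the variable names are just for illustration):

```javascript
const escaped = "\\f";   // a backslash followed by the letter f
const formFeed = "\f";   // a single FORM FEED character

console.log(escaped.length);          // 2
console.log(formFeed.length);         // 1
console.log(formFeed.charCodeAt(0));  // 12, i.e. U+000C
console.log(escaped === formFeed);    // false
```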

If your string came "from somewhere" then make sure to properly escape your content before you start to work with it. For example, if this is user input and you composed it, like:

let text = `<div>${input.value}</div>`;

then: make sure to escape that value before you template it in.
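
As a rough sketch of what that could look like, assuming "escape" here means HTML-escaping the value: escapeHtml below is a hypothetical helper (not a built-in), and userValue stands in for input.value.

```javascript
// Hypothetical helper, not a built-in: replaces the characters that are
// significant in HTML with their entity equivalents.
function escapeHtml(value) {
  const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(value).replace(/[&<>"']/g, ch => entities[ch]);
}

// Stand-in for input.value; String.raw keeps the backslash as typed.
const userValue = String.raw`\frac{5}{6} </div><b>oops`;
const safeText = `<div>${escapeHtml(userValue)}</div>`;

console.log(safeText);
```

Text that a user types into an input field already contains a real backslash, so only the markup characters need attention in that case. The form feed problem comes from how string literals in source code are interpreted.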
