Remove HTML tags and their contents from a string - JavaScript

Time:11-21

Suppose I have the following string: const test = "This is outside the HTML tag. <title>How to remove an HTML element using JavaScript ?</title>";

I'd like to remove the content within all HTML tags in that string. I have tried test.replace(/(<([^>]+)>)/gi, ''), but this removes only the tags themselves, not the content inside them. I would expect the result to be just 'This is outside the HTML tag.'.

Is it possible to remove HTML tags and their contents from a string?
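To make the problem concrete, here is a minimal sketch of what the attempted replacement does. It assumes the pattern was meant to contain `[^>]+` (one or more non-`>` characters); the variable name `input` is only illustrative and holds the same string as `test` above.

```javascript
// Assumption: the intended pattern is /(<([^>]+)>)/gi.
const input = "This is outside the HTML tag. <title>How to remove an HTML element using JavaScript ?</title>";

// Each match is a single tag (opening or closing), never the text between tags,
// so the text inside <title> survives the replacement.
const tagsOnly = input.replace(/(<([^>]+)>)/gi, '');

console.log(tagsOnly);
// "This is outside the HTML tag. How to remove an HTML element using JavaScript ?"
```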

CodePudding user response:

Rather than trying to remove the HTML element with a regex, it's much more straightforward to create a detached <div> element and populate it with the string:

let myDiv = document.createElement('div');
myDiv.innerHTML = test;

and then remove the <title> element from that, using:

const myDivTitle = myDiv.querySelector('title');
myDiv.removeChild(myDivTitle);

Working Example (One Element):

const test = "This is outside the HTML tag. <title>How to remove an HTML element using JavaScript ?</title>";

let myDiv = document.createElement('div');
myDiv.innerHTML = test;
const myDivTitle = myDiv.querySelector('title');
myDiv.removeChild(myDivTitle);
const testAfter = myDiv.innerHTML;
console.log(testAfter);


The above works for one element (<title>) but you stated:

I'd like to remove the content within all HTML tags in that string

so let's try something more ambitious, using:

myDiv.querySelectorAll('*')

Working Example (All Elements):

const test = "<title>How to remove an HTML element using JavaScript ?</title> This is outside the HTML tag. <h1>Here we go...</h1> So is this. <p>This is going to save a lot of time trying to come up with regex patterns</p> This too.";

let myDiv = document.createElement('div');
myDiv.innerHTML = test;
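const myDivElements = myDiv.querySelectorAll('*');

// remove() detaches each element from its own parent, so nested elements
// do not throw the way myDiv.removeChild() would for a non-child node
for (const myDivElement of myDivElements) {
  myDivElement.remove();
}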

const testAfter = myDiv.innerHTML;
console.log(testAfter);

CodePudding user response:

You can try this, which strips the tags but keeps the text inside them:

var html = "<p>Hello, <b>Frields</b>";
var div = document.createElement("div");
div.innerHTML = html;
alert(div.innerText); // Hello, Frields

CodePudding user response:

You can remove everything between the opening and closing tags by putting a wildcard (.*) between two copies of your tag pattern:

const test = "This is outside the HTML tag. <title>How to remove an HTML element using JavaScript ?</title>";

console.log(test.replace(/(<([^>]+)>).*(<([^>]+)>)/, ''));
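One caveat with the wildcard approach: `.*` is greedy, so with several elements in the string it spans from the first tag to the last one and also removes the text between elements. The sketch below shows that, and then one possible variant using a lazy match plus a backreference to the tag name. `removeElements` is a hypothetical helper, not something from the answers above, and it assumes simple, non-nested markup; for nested or malformed HTML the DOM-based approach is likely the safer choice.

```javascript
const input = "<title>How to remove an HTML element using JavaScript ?</title> This is outside the HTML tag. <h1>Here we go...</h1> So is this. <p>This is going to save a lot of time trying to come up with regex patterns</p> This too.";

// Greedy wildcard: matches from the first tag through the last tag in the string.
const greedy = input.replace(/(<([^>]+)>).*(<([^>]+)>)/, '');
console.log(JSON.stringify(greedy)); // " This too."

// Hypothetical helper: removes each <tag>...</tag> pair separately.
function removeElements(html) {
  // ([a-z][\w-]*) captures the tag name; \1 requires the closing tag to match it.
  // [\s\S]*? is lazy, so the match stops at the first matching closing tag.
  return html.replace(/<([a-z][\w-]*)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, '');
}

const stripped = removeElements(input);

// Optional: collapse the whitespace left behind by the removed elements.
const tidy = stripped.replace(/\s+/g, ' ').trim();
console.log(tidy); // "This is outside the HTML tag. So is this. This too."
```

The backreference keeps `<h1>` from being closed by `</p>`, but the pattern still cannot handle an element nested inside another element of the same name.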
