In my project, we use a RegExp to extract and display the titles of cards that we receive from the deck. I recently found that the deck sometimes sends a different format, and then the titles are not displayed.
So, before it was always a string like this:
const res =
`<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
</p>`;
and the RegExp was:
/<p [^>]*>[\s]*<span[^>]*>(.+)<\/span><span[^>]*>(.+)<\/span><div[^>]*>(.+)<\/div>[\s]*<\/p>/i.exec(res);
Now we sometimes receive res with <div> and <br> tags inside:
const res =
`<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
<div style="font-size: 10px">Title:<br>Some text here</div>
</p>`;
The question is: how can I change the RegExp so that it ignores this <div>..<br>.</div>?
Here's a demo:
const res =
`<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
</p>`;
const newRes =
`<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
<div style="font-size: 10px">Title:<br>Some text here</div>
</p>`;
const regEx = /<p [^>]*>[\s]*<span[^>]*>(.+)<\/span><span[^>]*>(.+)<\/span>[\s]*<\/p>/i;
const correct = regEx.exec(res);
const broken = regEx.exec(newRes);
console.log('correct', correct);
console.log('broken', broken);
Would be really grateful for any help!
CodePudding user response:
Parse the htmlString into the DOM, then extract the text.
const res =
`<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
<div style="font-size: 10px">Title:<br>Some text here</div>
</p>`;
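const getNodes = str => {
  // Parse into a detached document, so nothing is added to the page.
  const doc = new DOMParser().parseFromString(str, 'text/html');
  // The HTML parser closes <p> before the <div>, so the first <p> holds only the two spans.
  const p = doc.querySelector('p');
  return p ? p.textContent.trim() : null;
};
console.log(getNodes(res));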
CodePudding user response:
Simplify the regex
/<p [^>]*>\s*<span[^>]*>(.*?)<\/span><span[^>]*>(.*?)<\/span>.*?<\/p>/si
This will match the p tag with the two spans and whatever else it contains. The s flag lets . match line breaks, so the trailing .*? can skip over the <div> block.
const res =
`<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
</p>`;
const newRes =
`<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
<div style="font-size: 10px">Title:<br>Some text here</div>
</p>`;
const regEx = /<p [^>]*>\s*<span[^>]*>(.*?)<\/span><span[^>]*>(.*?)<\/span>.*?<\/p>/si;
const correct = regEx.exec(res);
const broken = regEx.exec(newRes);
console.log('correct', correct);
console.log('broken', broken);
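If anything other than a single trailing <div> should still be rejected, a stricter variant is possible. The following is only a sketch under assumptions, not part of the answer above: titleRegEx and extractTitle are hypothetical names, the optional <div> group is an illustration, and it assumes the two spans sit on one line and contain no nested spans.

```javascript
// Hypothetical helper: the names and the optional-<div> group are illustrative assumptions.
// (?: ... )? allows at most one <div>...</div> block between the spans and </p>.
const titleRegEx =
  /<p [^>]*>\s*<span[^>]*>(.*?)<\/span><span[^>]*>(.*?)<\/span>\s*(?:<div[^>]*>[\s\S]*?<\/div>\s*)?<\/p>/i;

// Returns the text of both spans joined together, or null when the markup does not match.
const extractTitle = html => {
  const match = titleRegEx.exec(html);
  return match ? match[1] + match[2] : null;
};

const res = `<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
</p>`;
const newRes = `<p style="font-size: 27.2px">
<span>Some text here</span><span> - Title</span>
<div style="font-size: 10px">Title:<br>Some text here</div>
</p>`;

console.log(extractTitle(res));
console.log(extractTitle(newRes));
```

Both calls should yield the same joined title, and returning null instead of throwing lets the caller decide what to show when the deck sends yet another format.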