var str = '&amp;<![CDATA[&amp;]]><![CDATA[&amp;]]>&amp;';
In the above string, I want to convert only the &amp; inside the CDATA sections to &, not every &amp;.
Expected output: &amp;<![CDATA[&]]><![CDATA[&]]>&amp;
I tried the regular expression below:
str.trim().replace(/^(\/\/\s*)?<!\[CDATA\[|(\/&\/\s*)?\]\]>$/g, '&');
But the above code is not working as expected. I am not good at regular expressions. I went through different answers on Stack Overflow, but I was not able to find a better way to fix this. Could you please guide me?
CodePudding user response:
For this particular string you can apply /(?<=CDATA\[)[&a-z;]+(?=]])/g
It uses a positive lookbehind and a positive lookahead:
(?<=CDATA\[) is a positive lookbehind. It matches only what comes directly after CDATA[
(?=]]) is a positive lookahead. It matches only what comes directly before ]]
[&a-z;]+ matches text made up of lowercase letters, & and ;
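As a rough sketch of how that pattern behaves, assuming the entity in question is &amp; and a JavaScript engine that supports lookbehind (the variable names are only illustrative). Note that the whole matched run is replaced by a single &, so this only suits CDATA sections whose entire content is the entity:

```javascript
// Input from the question, with the entities written out explicitly (assumption).
const input = '&amp;<![CDATA[&amp;]]><![CDATA[&amp;]]>&amp;';

// Match a run of [&a-z;] only when it sits directly between "CDATA[" and "]]".
const insideCdata = /(?<=CDATA\[)[&a-z;]+(?=]])/g;

// The entities outside CDATA are not preceded by "CDATA[", so they are left alone.
const lookaroundResult = input.replace(insideCdata, '&');

console.log(lookaroundResult);
// &amp;<![CDATA[&]]><![CDATA[&]]>&amp;
```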
If I've got your idea correctly, it would be better to use XML parsers to manipulate a document.
Here you can find a sample js code.
CodePudding user response:
If you want to replace any &amp; in CDATA, regardless of what comes before and after it (within CDATA):
str.trim().replace(/<!\[CDATA\[.*?\]\]>/g, m => m.replace(/&amp;/g, '&'));
results in
"&amp;<![CDATA[&]]><![CDATA[&]]>&amp;"
This first matches the CDATA sections and replaces each one with the result of a function; the function replaces every &amp; with & (the inner regex needs the g flag, because replace with a plain string only changes the first occurrence).
Because that function is only applied to CDATA sections, any &amp; outside of CDATA will not be changed.
Example with more characters in CDATA:
var str = '&amp;<![CDATA[Oh look at this: &amp; Haha!]]>&amp;';
str.trim().replace(/<!\[CDATA\[.*?\]\]>/g, m => m.replace(/&amp;/g, '&'));
result:
"&amp;<![CDATA[Oh look at this: & Haha!]]>&amp;"
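The same idea can be wrapped in a small reusable function. This is only a sketch under a few assumptions: the name unescapeAmpInCdata is hypothetical, the entity is taken to be &amp;, and [\s\S]*? is used instead of .*? so that CDATA sections spanning several lines are matched as well:

```javascript
// Hypothetical helper: convert &amp; to & inside CDATA sections only.
function unescapeAmpInCdata(text) {
  // Outer replace: find each <![CDATA[ ... ]]> section (lazy, may span lines).
  return text.replace(/<!\[CDATA\[[\s\S]*?\]\]>/g, section =>
    // Inner replace: the g flag converts every &amp; within that section.
    section.replace(/&amp;/g, '&')
  );
}

console.log(unescapeAmpInCdata('&amp;<![CDATA[a &amp; b\n&amp; c]]>&amp;'));
```

Text outside the matched sections never reaches the inner replace, so it is returned unchanged.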
CodePudding user response:
If you have control over the data received, it is better to fix the data upstream. If not, you can use nested replaces:
- the outer replace identifies the <![CDATA[...]]> sections
- the inner replace converts &amp; to & inside each CDATA section
Both use the g flag to replace multiple times.
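[
'&amp;<![CDATA[&amp;]]><![CDATA[&amp;]]>&amp;',
'&amp;<![CDATA[this &amp; that]]>&amp;'
].forEach(str => {
// The outer regex needs the g flag too; otherwise only the first CDATA section is processed.
let result = str.replace(/<!\[CDATA\[[^\]]*\]\]>/g, m => m.replace(/&amp;/g, '&'));
console.log(result);
});
Output:
&amp;<![CDATA[&]]><![CDATA[&]]>&amp;
&amp;<![CDATA[this & that]]>&amp;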