I have to consume a web service that sends its XML data inside a CDATA section. The output I get is the following:
&lt;parent>
&lt;child1>
&lt;xmltag1>4 años &lt; 8 &lt;/xmltag1>
&lt;xmltag2>3 años &lt; 12 &lt;/xmltag2>
&lt;/child1>
&lt;/parent>
I have to turn this data into usable XML so I can work with it.
It should look like this:
<parent>
<child1>
<xmltag1>4 años &lt; 8 </xmltag1>
<xmltag2>3 años &lt; 12 </xmltag2>
</child1>
</parent>
I have tried various Java functions, such as StringEscapeUtils.unescapeXml(string);
I guess there could be a way of getting that result with a regex, something like:
string.replaceAll("<{0}>", "</{0}>");
CodePudding user response:
You can use
String fixedXml = text.replaceAll("&lt;(/?\\w+(?:\\s[^>]*)?>)", "<$1");
See the regex demo. Details:
- &lt; - a &lt; string
- (/?\\w+(?:\\s[^>]*)?>) - Group 1 ($1):
  - /? - an optional / char
  - \w+ - one or more word chars
  - (?:\s[^>]*)? - an optional sequence of a whitespace char and then any zero or more chars other than >
  - > - a > char.
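The replacement above can be tried out with a small self-contained sketch. It assumes the escaped tags arrive as &lt;name ...> or &lt;/name>, with only the opening bracket escaped, as in the question. The class name EscapedTagFixer and the method restoreTags are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

public class EscapedTagFixer {
    // "&lt;" followed by something that looks like a tag: an optional "/",
    // a tag name, optional attributes, and the closing ">".
    private static final Pattern ESCAPED_TAG =
            Pattern.compile("&lt;(/?\\w+(?:\\s[^>]*)?>)");

    // Turns the escaped "&lt;" back into "<" only where a tag follows;
    // a "&lt;" inside text content (e.g. "4 &lt; 8") is left untouched.
    public static String restoreTags(String text) {
        return ESCAPED_TAG.matcher(text).replaceAll("<$1");
    }

    public static void main(String[] args) {
        String input = "&lt;parent>&lt;child1>"
                + "&lt;xmltag1>4 años &lt; 8 &lt;/xmltag1>"
                + "&lt;/child1>&lt;/parent>";
        // prints <parent><child1><xmltag1>4 años &lt; 8 </xmltag1></child1></parent>
        System.out.println(restoreTags(input));
    }
}
```

The "&lt; 8" survives because the pattern requires a word character (or a slash and a word character) right after "&lt;", and here a space follows. Text content that happens to look like a tag, such as "&lt;b and c>", would be restored as well, so this is a heuristic rather than a full parser.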