I've been making a SOAP call. When I encode an ampersand in lower case, i.e. "Jack &amp; Jill",
it works as expected and is received on the other end, but when I send "Jack &AMP; Jill",
I get a 400 Bad Request error.
I've looked into it, and all I can find is that tags are case-sensitive; I haven't seen anything that specifically says entity encoding is case-sensitive.
Is this the case everywhere, or is there something that can be changed to allow the uppercase &AMP; to be accepted?
CodePudding user response:
XML is case-sensitive. Entity names, like element and attribute names, must use the correct case. The entity reference &amp; is defined in the XML specification and cannot be written as &AMP;.
HTML is a different matter: HTML5 defines both &amp; and &AMP; as named character references.
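As a quick way to see the XML behaviour, here is a minimal sketch using Python's standard-library parser (`xml.etree.ElementTree`, which is backed by expat). The helper name `parses` and the `<name>` element are made up for illustration; the SOAP server in the question may use a different parser, but any conforming XML parser should reject the uppercase form in the same way.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # An undefined entity such as &AMP; is a well-formedness error.
        return False


lower_ok = parses("<name>Jack &amp; Jill</name>")  # predefined entity
upper_ok = parses("<name>Jack &AMP; Jill</name>")  # not defined in XML
print(lower_ok, upper_ok)  # True False
```

A server that rejects malformed request bodies with a 400 is therefore behaving as expected; the fix belongs on the sending side.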
CodePudding user response:
It depends on the parser implementation. Here are some examples:
Python3 (case sensitive/insensitive)
>>> from xml.sax.saxutils import unescape
>>> unescape('&AMP; &amp;')
'&AMP; &'
echo -e '&AMP; \n&amp;' | python3 -c 'import html,sys; print(html.unescape(sys.stdin.read()), end="")'
&
&
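If the request body happens to be built in Python (an assumption, since the question does not say which client is used), a minimal sketch of avoiding the problem on the sending side is to let the library do the escaping instead of writing entities by hand. The variable names are illustrative only.

```python
from xml.sax.saxutils import escape, unescape

raw = "Jack & Jill"

# escape() always emits the lowercase predefined entities (&amp;, &lt;, &gt;).
encoded = escape(raw)
print(encoded)  # Jack &amp; Jill

# unescape() is case sensitive: only the lowercase form is decoded.
round_trip = unescape(encoded)
untouched = unescape("Jack &AMP; Jill")
print(round_trip)  # Jack & Jill
print(untouched)   # Jack &AMP; Jill
```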
PHP (case sensitive/insensitive)
php -r 'echo html_entity_decode("&AMP; \n&amp;") . "\n";'
&AMP;
&
php -r 'echo html_entity_decode("&AMP; \n&amp;", $flags = ENT_HTML5) . "\n";'
&
&
Xmllint (case sensitive)
echo -e '<div>\n&AMP; \n&amp;\n</div>' | xmllint --html --nowrap -
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
<div>
&amp;AMP;
&amp;
</div>
</body></html>
echo -e '<div>\n&AMP; \n&amp;\n</div>' | xmllint -
-:2: parser error : Entity 'AMP' not defined
&AMP;
^
recode (case sensitive)
echo -e '&AMP; \n&amp;' | recode html..ascii
&AMP;
&
Firefox (XML case sensitive/ HTML case insensitive)