We have a Node.js application using express and body-parser. The server listens for REST requests in JSON format. We are running into an issue where certain characters in the request are being HTML/XML escaped. We strongly suspect that something on the express or body-parser side is responsible, but we can't work out what.
In our test scenario we make a request with curl:
curl 'http://localhost:6009/stores/1561289s12' \
-X 'PUT' \
-H 'Accept: application/json, text/plain, */*' \
-H 'Content-Type: application/json; charset=utf-8' \
--data-raw $'{"name":"O\'Neil"}' \
--compressed
On the server, when we look at the name value in req.body, it comes out as:
O&#x27;Neil
What leads us to point the finger at express or body-parser is that, when we check with Wireshark, the body on the wire is exactly as we sent it.
We initialised JSON parsing as follows:
app.use(express.json({}));
- Express version: 4.17.1
- Node.js version: 14.15.1
This project also makes use of express-validator.
Can anyone suggest anything that could cause this issue?
CodePudding user response:
One thing to pay attention to is that express-validator does more than validation: it can also modify or 'sanitise' data. For this reason, if anything is being modified between reception and the point where you read the value in req.body, checking the settings of the validators is a good place to start.
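To rule out the JSON parsing step itself, a quick check along these lines may help. It is only a sketch: it uses plain JSON.parse, which is what body-parser's JSON parser relies on, and the variable names are made up for illustration.

```javascript
// The same JSON text that the curl command in the question sends.
const rawBody = '{"name":"O\'Neil"}';

// Plain JSON parsing, with no validators or sanitisers involved.
const parsed = JSON.parse(rawBody);

// The apostrophe survives parsing untouched.
console.log(parsed.name); // O'Neil
```

If the value is still intact at this stage, the change has to happen in a later step, such as a sanitiser.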
In this case, the behaviour can be attributed to escape: true or escape(), which replaces certain characters in the values with HTML entities (for example, ' becomes &#x27;).
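To illustrate the kind of transformation involved, here is a minimal sketch of an HTML-escaping function. It is a hypothetical stand-in, not the library's own code: the escapeHtml name and the ENTITIES table are assumptions for illustration, and the real escape() may cover more characters than the five shown here.

```javascript
// Hypothetical stand-in for an HTML-escaping sanitiser (not the library's code).
const ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
};

// Replace each special character with its HTML entity.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

console.log(escapeHtml("O'Neil")); // O&#x27;Neil
```

This kind of escaping is intended for values that will be written into HTML, which is why it looks like corruption when applied to data that is stored or returned as JSON.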
Examples of sanitisation in express-validator, first as a validation chain and then as a schema:
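// Validation chain form
import { body } from 'express-validator';

export default [
  body('name').trim().escape()
];

// Schema form (in a separate module, for use with checkSchema)
export default {
  name: {
    in: ['body'],
    trim: true,
    escape: true
  }
};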
Removing the 'escape' option or function call would resolve this.