Express is applying HTML entities to JSON bodies

Time:11-02

We have a Node.js application using express and body-parser. The server listens for REST requests in JSON format. Certain characters in the request body are being HTML/XML-escaped, and we strongly suspect that something on the express or body-parser side is responsible, but we can't work out what.

In our test scenario we make a request with curl:

curl 'http://localhost:6009/stores/1561289s12' \
  -X 'PUT' \
  -H 'Accept: application/json, text/plain, */*' \
  -H 'Content-Type: application/json; charset=utf-8' \
  --data-raw $'{"name":"O\'Neil"}' \
  --compressed

On the server when we look at the name value in req.body, it comes out as:

O&#x27;Neil

What leads us to point the finger at express or body-parser is that, when we check with Wireshark, the body on the wire is exactly as we sent it.

We initialised JSON parsing as follows:

  app.use(express.json({}));

  • Express version: 4.17.1
  • Node.js version: 14.15.1

This project also makes use of express-validator.

Can anyone suggest anything that could cause this issue?

CodePudding user response:

One thing to be aware of is that express-validator does more than validate: it can also modify or 'sanitise' data. So if a value changes between reception and the point where you read it from req.body, the validator configuration is a good place to start looking.
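To narrow down where a value changes, it can help to keep a copy of the raw request bytes and compare them with req.body at different points in the middleware chain. The sketch below relies on the verify option of express.json(), which is called with (req, res, buf, encoding). The helper name captureRawBody and the rawBody property are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper (not part of express): keeps a copy of the raw request
// bytes so they can be compared with req.body later in the middleware chain.
function captureRawBody(req, res, buf, encoding) {
  req.rawBody = buf.toString(encoding || 'utf8');
}

// Assumed wiring, shown as comments so the sketch runs without express installed:
//   app.use(express.json({ verify: captureRawBody }));
//   // ...then, inside the route handler, after the validators have run:
//   //   console.log(req.rawBody, req.body);

// Stand-alone check with a fake request object.
const fakeReq = {};
captureRawBody(fakeReq, null, Buffer.from('{"name":"O\'Neil"}'), 'utf8');
console.log(fakeReq.rawBody);
```

If rawBody still contains the plain apostrophe while req.body does not, the change was introduced after parsing, which points at a later middleware rather than at express.json().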

In this case the behaviour can be attributed to escape: true or escape(), which replaces characters such as ', <, > and & in the value with HTML entities.
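As a rough illustration of what this kind of sanitiser does to a value, here is a minimal, dependency-free sketch. The function escapeHtml and its entity table are hypothetical stand-ins. The exact set of characters that express-validator escapes is an assumption here and may be larger.

```javascript
// Approximation of HTML-entity escaping. This is NOT express-validator's own
// code; the character set below is an assumption made for illustration.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
};

// Replace each special character with its HTML entity.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml("O'Neil")); // O&#x27;Neil
```

This kind of transformation is useful when a value is about to be written into an HTML page, but applied to incoming JSON it changes the stored data itself.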

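Examples of sanitisation in express-validator, first as a validation chain:

import { body } from 'express-validator';

export default [
  // trim() and escape() are sanitisers: they change the value in req.body
  body('name').trim().escape()
];

and the equivalent in schema form:

export default {
  name: {
    in: ['body'],
    trim: true,
    escape: true
  }
};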

Removing the 'escape' option or function call would resolve this.
