How to encode regex pattern for valid json and html5 for form validation?

Time:12-17

I am working on improving the validation of form configurations that are saved in a Python Flask app. The form config comes in via an API as JSON. A user can save a regex pattern for a form field to add extra validation. Since I also wanted to prevent, by default, the submission of anything like a URL in normal text fields (first name, last name, etc.), I added a marshmallow regex pattern validation for all text fields: ^((?!\:\/\/).)*$
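As a minimal sketch of what that pattern accepts and rejects, the snippet below compiles it with Python's standard `re` module instead of marshmallow, so it runs without extra dependencies. The helper name `is_plain_text` is made up for illustration and is not part of the app.

```python
import re

# Pattern quoted from the question. Compiled with the standard library here,
# which is an assumption for illustration, not the app's marshmallow schema.
NO_URL_PATTERN = r"^((?!\:\/\/).)*$"
no_url = re.compile(NO_URL_PATTERN)


def is_plain_text(value: str) -> bool:
    """Return True when the value contains no '://' sequence."""
    return no_url.match(value) is not None


print(is_plain_text("Alice"))                # True
print(is_plain_text("https://example.com"))  # False
```

The negative lookahead is checked before every character, so a single `://` anywhere in the value makes the whole match fail.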

I also wanted a matching regex pattern on the frontend, where the form config is rendered into an actual HTML form. This is done by a small petite-vue app that provides the HTML templates for the form fields and automatically adds the pattern from the JSON config file to the HTML.

I noticed:

  1. My form config saved in the JSON file is not valid. The form won't render and I get an error: "Invalid escape character in string."

It seems the backslashes \ are a problem for JSON.

  2. The HTML5 pattern validation in Firefox is not happy with my regex pattern and gives me this error: "Unable to check input because the pattern is not a valid regexp: invalid identity escape in regular expression".

The backslashes seem to be an issue here as well.
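The JSON half of this can be reproduced in a few lines. This is a minimal sketch, assuming the pattern was pasted unescaped into a hypothetical config fragment with a `pattern` key. JSON only allows a fixed set of backslash escapes: `\/` is one of them, `\:` is not.

```python
import json

# Hypothetical config fragment with the pattern pasted in as-is.
pasted = r'{"pattern": "^((?!\:\/\/).)*$"}'

try:
    json.loads(pasted)
    parse_error = None
except json.JSONDecodeError as exc:
    # "\:" is not a legal JSON escape, so parsing stops here.
    parse_error = exc

print(type(parse_error).__name__)  # JSONDecodeError

# "\/" on its own is legal JSON, but the parser consumes the backslash.
print(json.loads(r'"\/"'))  # /
```

So even where an unescaped backslash happens to parse, the string that comes out is no longer the string that went in.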

I already found out I can:

  • URL-encode my regex pattern.
  • escape the backslashes for the JSON file.

But now I'm stuck on deciding which potential solution is the more robust choice, and which of the two apps is the place to implement it.


So I need help deciding:

  • Should the regex be backslash-escaped when it is saved to the form config, so that the JSON is valid, with the petite-vue app then un-escaping and URL-encoding the pattern string? That feels like the right place for each task, but it also sounds like a potential source of errors if the escaping and unescaping of the backslashes is not done right.
  • Should I URL-encode every regex pattern as soon as it gets saved in the Python Flask app? I would need to check whether it has already been URL-encoded, maybe by decoding it and comparing the result to the encoded version. The upside is that the pattern would only be transformed once.
  • Is there a better solution I'm not thinking of?

CodePudding user response:

I'll answer my own question. I understand now why it was unclear and not reproducible. Hopefully an answer is still valuable.

TL;DR: The pattern is not right. You don't need to escape the "/" or the ":".
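A minimal sketch of that fix, with the pattern written out without the escapes (the exact final pattern is not quoted in this answer, so `^((?!://).)*$` is an assumption derived from the sentence above). It checks with Python's `re` module that both versions behave the same on a few sample values, and that the simplified one contains nothing JSON has to escape.

```python
import json
import re

original = re.compile(r"^((?!\:\/\/).)*$")  # pattern from the question
simplified = re.compile(r"^((?!://).)*$")   # same pattern, escapes removed

samples = ["Alice", "", "a:b", "a/b", "http://example.com", "x ftp://y"]

for value in samples:
    # Both patterns must agree on every sample.
    assert bool(original.match(value)) == bool(simplified.match(value))

# With no backslashes in the pattern, the JSON text contains none either.
as_json = json.dumps({"pattern": simplified.pattern})
print(as_json)  # {"pattern": "^((?!://).)*$"}
```

Without the `\:` escape there is also nothing left for a browser's stricter regex mode to reject as an invalid identity escape.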

Long version

In the app flow, the pattern gets saved via Python in a JSON file (i.e. it uses json_dumps() and is already encoded at that point). In my test case I just copied and pasted the pattern into the JSON file, which resulted in invalid JSON. When escaping the \ manually, I either got the "invalid identity escape" error or the pattern wouldn't match.
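The difference between the two paths can be sketched with the standard `json` module (assumed here to stand in for whatever serializer the app actually calls). The serializer doubles every backslash in the file and the parser restores the original string, so the pattern itself never needs manual escaping.

```python
import json

# State 1: the regex as the Python app holds it in memory.
pattern = r"^((?!\:\/\/).)*$"

# State 2: the JSON text written to disk. Each backslash is doubled.
as_json = json.dumps({"pattern": pattern})

# The same JSON text typed by hand; easy to get wrong by one backslash.
hand_escaped = r'{"pattern": "^((?!\\:\\/\\/).)*$"}'

# State 3: the string after the file is parsed again.
restored = json.loads(as_json)["pattern"]

print(as_json == hand_escaped)  # True
print(restored == pattern)      # True
```

Editing the file by hand means reproducing state 2 exactly, which is where the copy/paste test went wrong.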

  • It was helpful to get a solid mental model of the string and its representative states throughout the app.
  • It was helpful to research more of the different regex flavors.
  • I thought the first comment I found for the error message didn't apply, but it actually did 100%. Would have saved me some hours if I had fully embraced it.