Sanitizing strings with several types of quotation marks

Time:01-11

I'm working on a side project that aggregates data from various websites, sanitizes the input data, then stores it in Postgres.

So far I've had to implement my own solutions for sanitizing dirty/ugly data, which hasn't been too bad, but I've run into an issue with height measurements that use a mixed bag of quote types, e.g. 5’4″, 5′ 9″.

I'd like to sanitize the strings as follows:

  • ’, ′ and similar characters are replaced with single quotes for feet.
  • ″ and similar characters are replaced with double quotes for inches.

Is there a library which solves this problem?
If not, is there a concise regex that provides the same result?

CodePudding user response:

We can use a regex replacement with a lookup map here:

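// Lookup table: typographic mark -> ASCII quote
var map = {};
map["’"] = "'";   // right single quotation mark (U+2019) -> feet
map["′"] = "'";   // prime (U+2032) -> feet
map["″"] = "\"";  // double prime (U+2033) -> inches

var input = "5’4″ and 5′ 9″";
// Replace every character in the class with its mapped value
var output = input.replace(/[’′″]/g, (x) => map[x]);
console.log(input + " => " + output);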
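
If the scraped data contains more variants than the three above, the same lookup idea can be wrapped in a reusable function. The following is only a sketch: the helper name normalizeQuotes and the two character lists are assumptions, not an exhaustive or standard set, so extend them with whatever marks actually show up in your data.

```javascript
// Hypothetical character lists (assumptions); written as escapes so the
// source file's encoding cannot mangle them.
const FEET_MARKS = "\u2018\u2019\u2032\u02B9\u0060\u00B4"; // ‘ ’ ′ ʹ ` ´
const INCH_MARKS = "\u201C\u201D\u2033\u02BA";             // “ ” ″ ʺ

// Map every variant to its ASCII replacement.
const QUOTE_MAP = new Map([
  ...[...FEET_MARKS].map((ch) => [ch, "'"]),
  ...[...INCH_MARKS].map((ch) => [ch, '"']),
]);

// Build the character class from the map keys so the two stay in sync.
// None of the listed characters is special inside a character class.
const QUOTE_PATTERN = new RegExp(`[${[...QUOTE_MAP.keys()].join("")}]`, "g");

// Hypothetical helper: replaces each matched mark with its ASCII quote.
function normalizeQuotes(input) {
  return input.replace(QUOTE_PATTERN, (ch) => QUOTE_MAP.get(ch));
}

console.log(normalizeQuotes("5\u20194\u2033 and 5\u2032 9\u2033"));
// prints: 5'4" and 5' 9"
```

Generating the regex from the map means adding a new variant is a one-line change, and characters not in the map are left untouched.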
