I'm working on a side project that aggregates data from various websites, sanitizes it, then stores it in Postgres.
Currently, I have to implement my own solutions for sanitizing dirty data. That hasn't been too bad, but I've run into an issue with height measurements, which come with a mixed bag of quote types, e.g. 5’4″, 5′ 9″.
I'd like to sanitize the strings as follows:
’, ′ and similar characters are replaced with single quotes for feet.
″ and similar characters are replaced with double quotes for inches.
Is there a library which solves this problem?
If not, is there a concise regex that provides the same result?
Answer:
We can use a regex replacement with a lookup map here: the regex matches each typographic quote, and the callback returns its ASCII equivalent from the map.
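// Lookup table: typographic quote or prime -> ASCII quote
const map = {
  "’": "'",  // right single quotation mark (U+2019)
  "′": "'",  // prime (U+2032)
  "″": "\"", // double prime (U+2033)
};
const input = "5’4″ and 5′ 9″";
// Replace every character in the class with its mapped ASCII equivalent
const output = input.replace(/[’′″]/g, (ch) => map[ch]);
console.log(input + " => " + output);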
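The question also asks about "similar characters", so here is a minimal sketch that widens the same idea to a few more look-alikes. The function name `normalizeHeightQuotes` and the exact character lists are my own assumptions, not a standard; extend them with whatever actually shows up in your data. Code points are written as escapes so the sets stay readable.

```javascript
// Hypothetical helper: maps quote-like characters to ASCII ' (feet) and " (inches).
// The character sets are an assumption about which look-alikes appear in the data.
function normalizeHeightQuotes(text) {
  return text
    // U+2018 left single quote, U+2019 right single quote, U+201B reversed quote,
    // U+2032 prime, U+02B9 modifier prime, backtick, U+00B4 acute accent
    .replace(/[\u2018\u2019\u201B\u2032\u02B9`\u00B4]/g, "'")
    // U+201C left double quote, U+201D right double quote, U+201F reversed double quote,
    // U+2033 double prime, U+02BA modifier double prime, U+02DD double acute accent
    .replace(/[\u201C\u201D\u201F\u2033\u02BA\u02DD]/g, '"');
}

console.log(normalizeHeightQuotes("5\u20194\u2033 and 5\u2032 9\u2033"));
```

Two chained character-class replacements avoid the lookup object entirely, since every character in a class maps to the same target. Whitespace is left untouched, so "5′ 9″" keeps its space; strip it in a separate step if you need a canonical form.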