RegEx that remembers the first separator used and only allow that separator and require the separator

Time:07-11

I have a RegEx that matches a Date in the formats documented:

/**
 * RegExp to test for a date in YYYY-MM-DD format (with or without separators)
 * where year has to be 1900-9999, MM = 01-12, DD = 01 - 31
 * YYYYMMDD
 * YYYY-MM-DD
 * YYYY/MM/DD
 * These also match but I'm not thrilled with it
 * YYYY-MM/DD
 * YYYY/MMDD (also matches YYYY-MMDD)
 * YYYYMM-DD  (also matches YYYYMM/DD)
 * @type {RegExp}
 */
const patternYYYYMMDD = /\d{4}[-\/]?(0[1-9]|1[0-2])[-\/]?(0[1-9]|[12][0-9]|3[01])/;

// To test (open browser console and paste)
const p = (msg) => console.log(msg);
p(patternYYYYMMDD.test("20220102"))
p(patternYYYYMMDD.test("2022-01-02"))
p(patternYYYYMMDD.test("2022/01/02"))
p(patternYYYYMMDD.test("2022/0102"))
p(patternYYYYMMDD.test("202201/02"))
p(patternYYYYMMDD.test("2022/01-02"))

My question is: Is it possible to create a RegEx that remembers the first separator used, only allows that separator afterwards, and requires the separator?

When searching for an answer, this was very helpful. https://stackoverflow.com/a/37563868/3281336

This might be the answer, I just don't understand how to apply the information in this answer. https://stackoverflow.com/a/17949129/3281336

CodePudding user response:

My understanding is that you want each date to have the same delimiter (if any), so '2022-01-02' is valid, but not '2022-01/02'.

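A character class such as [-\/]? on its own isn't going to 'remember' the first delimiter, because each occurrence matches independently. Requiring subsequent delimiters to match the first one takes a backreference (capturing the separator in a group and reusing it with \1).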

Does it matter which delimiter is stored? If it were me, I'd clean the data to make sure all date strings follow the same format. You could do that for each string like this, so that the format is YYYYMMDD (or whatever other format you want). This assumes all data comes in with the right number of characters and no characters other than the ones you've listed.

'2022-01/02'.replaceAll('-','').replaceAll('/','')
> '20220102'

Then you can move towards ISO 8601 date format like so:

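const a = '20220102'
// Month is zero-based in Date, hence the - 1; Date.UTC avoids a local-time offset
new Date(Date.UTC(a.slice(0,4), a.slice(4,6) - 1, a.slice(6,8)))
> 2022-01-02T00:00:00.000Z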
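Building on the slicing idea above, here is a minimal sketch of a more defensive version. It assumes the string has already been stripped down to eight digits, and it round-trips the result to reject values such as February 31 that a pattern like (0[1-9]|[12][0-9]|3[01]) would let through. The helper name parseCompactDate is made up for illustration and is not part of any library.

```javascript
// Hypothetical helper: parse a separator-free YYYYMMDD string into a UTC Date,
// or return null when the digits do not form a real calendar date.
function parseCompactDate(s) {
  // Exactly eight digits, nothing else
  if (!/^\d{8}$/.test(s)) return null;

  const year = Number(s.slice(0, 4));
  const month = Number(s.slice(4, 6)); // 1-12 as written in the string
  const day = Number(s.slice(6, 8));

  // Date months are zero-based, so subtract 1
  const d = new Date(Date.UTC(year, month - 1, day));

  // Date silently rolls over invalid values (Feb 31 becomes Mar 3),
  // so compare the parts that come back with the parts that went in
  const sameParts =
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day;

  return sameParts ? d : null;
}

const cleaned = '2022-01/02'.replaceAll('-', '').replaceAll('/', '');
console.log(parseCompactDate(cleaned));    // a Date for 2 January 2022 (UTC)
console.log(parseCompactDate('20220231')); // null
```

Whether UTC is the right choice depends on how the dates are used later; swap in the local-time constructor if that fits the data better.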

CodePudding user response:

The obvious solution would be to create three patterns and 'or' them:

const patternNull = /\d{4}(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])/
const patternDash = /\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])/
const patternSlash = /\d{4}[/](0[1-9]|1[0-2])[/](0[1-9]|[12]\d|3[01])/
const pattern = new RegExp(patternNull.source + '|' + patternDash.source + '|' + patternSlash.source)

Or build it in one go, without repeating the year \d{4} at the start of each alternative:

const pattern = /\d{4}((0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])|-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])|[/](0[1-9]|1[0-2])[/](0[1-9]|[12]\d|3[01]))/
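For comparison, the three alternatives can probably be collapsed with a backreference, which is the "remember the first separator" behaviour the question asks about. The sketch below is not from the original answers. It adds ^ and $ anchors as an assumption, since the unanchored patterns above also match longer strings that merely contain a date, and the names sameSeparator and requiredSeparator are made up for illustration.

```javascript
// Group 1 captures the first separator (possibly the empty string);
// \1 must then repeat exactly what group 1 matched.
const sameSeparator = /^\d{4}([-\/]?)(0[1-9]|1[0-2])\1(0[1-9]|[12]\d|3[01])$/;

// Variant without the "?": a separator is required, and both must be identical.
const requiredSeparator = /^\d{4}([-\/])(0[1-9]|1[0-2])\1(0[1-9]|[12]\d|3[01])$/;

const samples = ["20220102", "2022-01-02", "2022/01/02", "2022/0102", "202201/02", "2022/01-02"];
for (const s of samples) {
  console.log(s, sameSeparator.test(s), requiredSeparator.test(s));
}
```

With these patterns the first three samples pass sameSeparator and the mixed or half-separated ones fail, while requiredSeparator also rejects "20220102".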