Regex matches numbers in date but shouldn't

Time:10-28

Why does my regex pattern match the date part of the string? I thought [^\/] would account for the / (slash) and keep the pattern from matching date strings, but it doesn't seem to.

const reg = new RegExp(
  /(USD|\$|EUR|€|USDC|USDT)?\s?(\d+[^\/]|\d{1,3}(,\d{3})*)(\.\d+)?(k|K|m|M)?\b/,
  "i"
);

const str = "02/22/2021 $50k";

console.log(reg.exec(str));

// result: ['02', undefined, '02', undefined, undefined, undefined, undefined, index: 0, input: '02/22/2021 $50k', groups: undefined]

// was expecting: [$50k,...]

CodePudding user response:

You get the matches in the date part, and the undefined entries, because the pattern is built from optional parts and alternations (|).

Your pattern contains the part (\d+[^\/]|\d{1,3}(,\d{3})*). The first alternative, \d+[^\/], matches 1+ digits followed by any char except a / (which can also be a digit), so it needs at least 2 characters. That alternative matches 02, 22 and 2021 in the date part.

If there is 1 digit, the second part of the alternation will match it.
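To see the backtracking concretely, here is a small sketch that isolates the first alternative and runs it over the date alone. It assumes the quantifier in the question's pattern is \d+; the variable names are only for illustration.

```javascript
// Isolate the first alternative from the question's pattern.
const firstAlternative = /\d+[^\/]/g;
const date = "02/22/2021";

// \d+ first grabs "02", but the next char is "/", which [^\/] rejects.
// The engine backtracks: \d+ gives back one digit and [^\/] consumes it.
const dateMatches = date.match(firstAlternative);
console.log(dateMatches); // [ '02', '22', '2021' ]
```

So the negated character class does not stop the match at a slash; it only forces the last matched character to be something other than a slash.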


If you also want to match bare numbers, you can assert that there is no / directly to the left and to the right of the number. Make the whole leading part, the currency alternatives such as USD together with the optional whitespace char, a single optional group, so that the whitespace is not matched on its own before a bare number.

The last alternation can be shortened to a character class [km]? with a case insensitive flag.

See this page for lookbehind support in JavaScript.
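If you are unsure whether your target engine supports lookbehind, one possible runtime check is sketched below. The helper name supportsLookbehind is hypothetical. It relies on the RegExp constructor throwing a SyntaxError for syntax the engine cannot compile, and it uses a string rather than a regex literal because an unsupported literal would fail when the script is parsed, before any try/catch could run.

```javascript
// Hypothetical helper: returns true when the engine can compile
// a negative lookbehind, false when the constructor throws.
function supportsLookbehind() {
  try {
    new RegExp("(?<!a)b");
    return true;
  } catch (err) {
    return false;
  }
}

console.log(supportsLookbehind());
```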

(?:(?:USD|\$|EUR|€|USDC|USDT)\s?)?(?<!\/)\b(?:\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+)(?!\/)[KkMm]?\b

Regex demo

const reg = /(?:(?:USD|\$|EUR|€|USDC|USDT)\s?)?(?<!\/)\b(?:\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+)(?!\/)[KkMm]?\b/gi;
const str = "02/22/2021 $50k 1,213.3 11111111 $50,000 $50000";
const res = Array.from(str.matchAll(reg), m => m[0]);
console.log(res);

If the currency is not optional:

(?:USD|\$|EUR|€|USDC|USDT)\s?(?:\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+)[KkMm]?\b

Regex demo
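A short sketch of the required-currency variant in use, assuming the quantifiers are \d+ and adding the g and i flags so that matchAll can be used. The sample string and variable names are made up for illustration.

```javascript
// Currency is mandatory here, so the date and bare numbers are skipped
// without needing any lookarounds.
const requiredCurrency =
  /(?:USD|\$|EUR|€|USDC|USDT)\s?(?:\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+)[KkMm]?\b/gi;

const sample = "02/22/2021 $50k 1,213.3 11111111 $50,000 EUR 2.5m";
const amounts = Array.from(sample.matchAll(requiredCurrency), (m) => m[0]);
console.log(amounts); // [ '$50k', '$50,000', 'EUR 2.5m' ]
```

Because every match has to start with a currency marker, this version also runs on engines without lookbehind support.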

CodePudding user response:

I couldn't quite follow your regex, so I tried to work out what result you expect. Check this: each part of your string ends up in its own group.

const regex = /(\d{2})*\/?(\d{2})\/(\d{2,4})?\s*(USD|\$|EUR|€|USDC|USDT)?(\d*)(k|K|m|M)?\b/i
const regexNamed = /(?<day>\d{2})*\/?(?<month>\d{2})\/(?<year>\d{2,4})?\s*(?<currency>USD|\$|EUR|€|USDC|USDT)?(?<value>\d*)(?<unit>k|K|m|M)?\b/i

const str1 = '02/22/2021 $50k'
const str2 = '02/2021 €50m'

const m1 = str1.match(regex)
const m2 = str2.match(regexNamed)
console.log(m1)
console.log(m2.groups)
