Regex - recognise clusters of numbers and characters but only match a part with numbers

Time:02-28

I want to write a regex for age recognition. It should recognise both simple forms such as 98, 1, 25 and more complex ones like 90-years-old. However, for the more complex forms, I would like it to match only the number. What I have now matches both the number and the trailing characters:

\b(1[0-4]\d?|\d{0,2}\d{1})(-?years?)?\b

For example, in the string: "18year old 23 year old 99 years old but not 25-year-old and 91year old cousin is 99 now and 90-year-old or 102 year old 505 0 year 1 year 11 year old 11 year 199 102 0-year13 13 14 22 33 45 8 99years", it matches 23, 99, etc. as plain numbers, but 18year or 90-year comes back as a single match with two groups.

How can I change it so that only a number in such a cluster is matched (a single group)?
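
To make the behaviour described above concrete, here is a minimal sketch using Python's re module. The choice of Python, the helper name show_matches and the shortened sample string are assumptions for illustration only; the question does not name a language.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL_PATTERN = re.compile(r"\b(1[0-4]\d?|\d{0,2}\d{1})(-?years?)?\b")


def show_matches(text):
    """Return (full match, group 1, group 2) for every match in text."""
    return [(m.group(0), m.group(1), m.group(2))
            for m in ORIGINAL_PATTERN.finditer(text)]


# Shortened sample, assumed for illustration.
sample = "18year old 23 year old 90-year-old"
for full, number, suffix in show_matches(sample):
    print(repr(full), repr(number), repr(suffix))
```

In this sketch the full match for 18year and 90-year-old includes the year part, while the bare 23 matches on its own with an empty second group.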

CodePudding user response:

You can use

\b(\d{1,3})(?=-?year|\b)
\b\d{1,3}(?=-?year|\b)

The first one captures the number in group 1. The second one has no capturing groups, so the whole match is the number.

Details:

  • \b - a word boundary
  • \d{1,3} - one, two or three digits
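  • (?=-?year|\b) - a positive lookahead that requires the digits to be immediately followed by either an optional - and then year, or a word boundary. The lookahead only checks this text; it is not included in the match.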
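
As a rough illustration of the lookahead version, the sketch below applies the second pattern with Python's re module. Python, the name extract_ages and the shortened sample string are assumptions made for this example; the answer itself is language-agnostic.

```python
import re

# Second pattern from the answer: no capturing groups, so each match
# is just the digits. The lookahead checks for "year" / "-year" or a
# word boundary without consuming it.
AGE_PATTERN = re.compile(r"\b\d{1,3}(?=-?year|\b)")


def extract_ages(text):
    """Return every 1-3 digit number matched by AGE_PATTERN, as strings."""
    return AGE_PATTERN.findall(text)


# Shortened sample, assumed for illustration.
sample = "18year old 23 year old 90-year-old 99years 102"
print(extract_ages(sample))  # ['18', '23', '90', '99', '102']
```

Note that \d{1,3} accepts any number of one to three digits, so a value such as 505 still matches; if only plausible ages are wanted, a range check would have to be applied separately after matching.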