Regular expression: How to exclude unwanted matches

Time:11-23

I have a regular expression that searches for rows containing 4-digit numbers, specifically 19xx. It gives too many matches, so I am looking for a way to exclude the ones I don't want.

This is my current regex:

^\s*[^\/].*19\d{2}

Here are some example rows:

short param1 = 1994;
       short param2 = 1918;
// 1998-08-20     
       // 1998-08-20    
      //## begin protected section initialization list [51935568]
//## begin protected section initialization list [51935568]

(Rows 2, 4 and 5 have leading spaces.)

My regex correctly manages to:

  • find rows 1 and 2
  • exclude rows 3 and 6

But it also incorrectly matches rows 4 and 5. I can't find a way to make the regex exclude these rows.
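
The behaviour described in the question can be reproduced with a short sketch. It assumes Python's `re` module and applies the pattern to one row at a time; the names `rows`, `current` and `matched_rows` are only illustrative. Rows 4 and 5 slip through because `\s*` can backtrack and give up one leading space, which `[^\/]` then consumes, so the `//` is never tested against `[^\/]`.

```python
import re

# Sample rows from the question (leading spaces kept on rows 2, 4 and 5).
rows = [
    "short param1 = 1994;",
    "       short param2 = 1918;",
    "// 1998-08-20     ",
    "       // 1998-08-20    ",
    "      //## begin protected section initialization list [51935568]",
    "//## begin protected section initialization list [51935568]",
]

# The regex from the question, unchanged.
current = re.compile(r"^\s*[^\/].*19\d{2}")

# 1-based row numbers that the pattern matches.
matched_rows = [n for n, row in enumerate(rows, start=1) if current.search(row)]
print(matched_rows)  # [1, 2, 4, 5]
```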

CodePudding user response:

Presumably you want to use word boundaries here:

\b19[0-9]{2}\b

The above pattern will only match 19xx years appearing as standalone words.
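
As a rough check of what word boundaries do on the sample rows, here is a minimal sketch, assuming Python's `re` module and row-by-row matching (the variable names are illustrative, not from the answer). Under these assumptions the pattern no longer matches `1935` inside `51935568`, but it still matches the dates in the commented rows 3 and 4, so it does not exclude comments on its own.

```python
import re

# Sample rows from the question (leading spaces kept on rows 2, 4 and 5).
rows = [
    "short param1 = 1994;",
    "       short param2 = 1918;",
    "// 1998-08-20     ",
    "       // 1998-08-20    ",
    "      //## begin protected section initialization list [51935568]",
    "//## begin protected section initialization list [51935568]",
]

# 19xx only when it stands alone as a "word" (no adjacent digits or letters).
year = re.compile(r"\b19[0-9]{2}\b")

# 1-based row numbers that contain a standalone 19xx number.
matched_rows = [n for n, row in enumerate(rows, start=1) if year.search(row)]
print(matched_rows)  # [1, 2, 3, 4]
```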


CodePudding user response:

If I understand you correctly, you don't want to match numbers that are commented out, i.e. on rows that start with // (optionally after leading whitespace):

^(?!\s*\/\/).*?\b(\d{4})\b



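  • ^ - match the beginning of the string (or of each line, in multiline mode)

  • (?!\s*\/\/) - negative lookahead: fail the match if the row starts with optional whitespace followed by //

  • .*?\b(\d{4})\b - lazily skip ahead to the first standalone 4-digit number (delimited by word boundaries) and capture it; use 19\d{2} instead of \d{4} to accept only 19xx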
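
The breakdown above can be tried out with a small sketch, assuming Python's `re` module and one row per call. The helper `captured` and the `narrowed` variant (which swaps `\d{4}` for `19\d{2}` to match the 19xx requirement from the question) are illustrative additions, not part of the original answer.

```python
import re

# Sample rows from the question (leading spaces kept on rows 2, 4 and 5).
rows = [
    "short param1 = 1994;",
    "       short param2 = 1918;",
    "// 1998-08-20     ",
    "       // 1998-08-20    ",
    "      //## begin protected section initialization list [51935568]",
    "//## begin protected section initialization list [51935568]",
]

# Pattern from the answer: any standalone 4-digit number on a non-comment row.
answer_pattern = re.compile(r"^(?!\s*//).*?\b(\d{4})\b")

# Hypothetical narrowing: accept only 19xx.
narrowed = re.compile(r"^(?!\s*//).*?\b(19\d{2})\b")


def captured(pattern, lines):
    """Map 1-based row number -> captured number for every matching row."""
    result = {}
    for n, line in enumerate(lines, start=1):
        m = pattern.search(line)
        if m:
            result[n] = m.group(1)
    return result


print(captured(answer_pattern, rows))  # {1: '1994', 2: '1918'}
print(captured(narrowed, rows))        # {1: '1994', 2: '1918'}
```

On these rows both patterns give the same result. They differ only on input such as `short param3 = 2024;`, which the answer's pattern matches and the narrowed one does not.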
