Match version numbers using regex while excluding entries with underscore

Time:04-16

I'm trying to match release versions with a regex, but exclude any string, or any group within a string, that has an underscore or a letter after the version number. The pattern is n.n with an optional additional .n and nothing else: a major.minor version, optionally followed by a build number.

For example:

1.2.3
10.2.4
10.20.5
10.20.323
1.20.30
1.2.33
1.0
1.2a
1.1.2a
1.22_UAT_2
1.10.2_TEST1
1.2_UAT2
2.0_LIVE_2_
a line with text which has 1.2 or then 1.2.1 but then 1.2.1_UAT2 or 1.3_TEST also exclude such version as 1.2a or 1.2.1b

Should return:

1.2.3
10.2.4
10.20.5
10.20.323
1.20.30
1.2.33
1.0
1.2
1.2.1

The best I have so far is ((\d+)\.)[^_]?((\d+)\.)[^_]?(\d+)[^_], but it doesn't match 1.0 or 1.2.
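For anyone wondering why that attempt misses the two-part versions, here is a small Python check. It assumes the pattern was written with `+` quantifiers after each `\d`; the variable names are only illustrative.

```python
import re

# Assumption: the attempted pattern with "+" after each \d.
attempt = re.compile(r"((\d+)\.)[^_]?((\d+)\.)[^_]?(\d+)[^_]")

# Two-part versions never match: the pattern requires "digits." twice.
two_part = [bool(attempt.search(s)) for s in ("1.0", "1.2")]

# A three-part version matches only when a non-underscore character follows it
# (here, the newline).
three_part = bool(attempt.search("1.2.3\n"))

print(two_part, three_part)
```

The second `((\d+)\.)` group is mandatory, so anything with a single dot is rejected before the underscore check is even reached.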

CodePudding user response:

Using ^([\d\.]*\n):

( ) marks a group

^ is used to search at the start of a Line

[\d\.]* matches any digit or ., * matches any amount of those

\n matches newline (the end of a line)

In other words: The expression matches any combination of digits or . that ranges from the start of the line to the end of the line
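A minimal Python sketch of this pattern. The sample text is made up from a few of the question's examples, and `re.MULTILINE` is assumed so that `^` matches at the start of every line.

```python
import re

# Pattern from this answer; re.MULTILINE makes ^ match at each line start.
pattern = re.compile(r"^([\d.]*\n)", re.MULTILINE)

# Hypothetical sample built from the question's examples.
sample = "1.2.3\n1.2a\n1.22_UAT_2\n1.0\ntext with 1.2 inside\n"

# Group 1 includes the trailing newline, so strip it off.
matches = [m.group(1).rstrip("\n") for m in pattern.finditer(sample)]
print(matches)
```

Because the match has to span a whole line, a version embedded in a sentence (the 1.2 in the last sample line) is not found. The pattern also requires a trailing newline.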

However, I am confused why your "Should return" block contains the 1.2.1 from the last line. Could you explain why that 1.2.1 should be matched?

CodePudding user response:

You can match a run of digits and dots, [\d\.]+, provided it is found

  • after a word boundary or a whitespace character (\b|\s)
  • before a whitespace character or the end of the line (\s|$)

Here's the complete regex:

(\b|\s)([\d\.]+)(\s|$)

If you want a stricter version that avoids matching versions with fewer than two numbers or more than three, you can use the following instead:

(\b|\s)([\d]+\.[\d]+([\.]+[\d]+)?)(\s|$)

To retrieve the version, read Group 2 in both regexes.
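A short Python sketch of reading Group 2 from the stricter pattern, written here with `+` quantifiers. The sample text is a hypothetical mix of lines from the question.

```python
import re

# Stricter pattern from this answer, with "+" quantifiers.
strict = re.compile(r"(\b|\s)([\d]+\.[\d]+([\.]+[\d]+)?)(\s|$)")

# Hypothetical sample built from the question's examples.
sample = (
    "1.2.3\n"
    "1.0\n"
    "1.2a\n"
    "1.22_UAT_2\n"
    "2.0_LIVE_2_\n"
    "has 1.2 or then 1.2.1 but then 1.2.1_UAT2 or 1.3_TEST\n"
)

# Group 2 holds the version; groups 1 and 4 are the surrounding delimiters.
versions = [m.group(2) for m in strict.finditer(sample)]
print(versions)
```

`finditer` with `m.group(2)` is used rather than `findall`, since `findall` would return a tuple of all four groups for each match.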

Does it work for your cases?

CodePudding user response:

You can try matching the following regular expression with the case-insensitive flag set.

^\d+(?:\.\d+)+(?![_\d.]|[a-z][a-z\d]*_)


The expression contains the following elements.

^           # match beginning of string
\d+         # match >= 1 digits
(?:         # begin non-capture group
  \.\d+     # match a period followed by >= 1 digits
)+          # end non-capture group and execute it >= 1 times
(?!         # begin negative lookahead
  [_\d.]    # match an underscore, digit or period
|           # or
  [a-z]     # match a letter
  [a-z\d]*  # match >= 0 letters or digits
  _         # match underscore
)           # end negative lookahead
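
A minimal Python sketch of applying this expression to one string at a time, since `^` anchors it to the start of the string. The candidate list is a hypothetical selection from the question, and `re.IGNORECASE` stands in for the case-insensitive flag.

```python
import re

# Expression from this answer, with "+" quantifiers and the case-insensitive flag.
pattern = re.compile(r"^\d+(?:\.\d+)+(?![_\d.]|[a-z][a-z\d]*_)", re.IGNORECASE)

# Hypothetical candidates taken from the question's examples.
candidates = ["1.2.3", "1.0", "1.22_UAT_2", "1.10.2_TEST1", "2.0_LIVE_2_", "1.2a"]

results = []
for text in candidates:
    m = pattern.match(text)
    # Store the matched version, or None when the string is rejected.
    results.append(m.group(0) if m else None)

print(results)
```

The `\d` and `.` inside the lookahead stop the regex from backtracking to a shorter version when an underscore follows, so `1.22_UAT_2` is rejected outright. A bare trailing letter is not rejected, though: for `1.2a` the expression still returns `1.2`, so an extra check would be needed if such entries have to be dropped entirely.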