Regex: Find int and long only (ignore digits in float format)

Time:11-04

I have a regex that identifies most int and long literals in C#/C++/Java. The regex:

((?<=\W)(-?\d+[lL]?)(?=\W))

Ints and longs that it identifies correctly:

int: 1,23,-1,-100,[13]
long: 1L,23l,23L,-1l,-100L,[16L]
if(m_ned==0) m_ned=1;   

But I just found that it also matches digits inside floats and numbers in scientific notation:

//NOT int and long:
not long: 1ll,1LL,1l0,1L0
not int or long: 3., .4, 5.6,
if(m_ned3==ned2) 
    m_ned=1e-7;  

3, 4, 5, 6 and 7 in the above text are matched as ints as well. I need something like "\W but not . or e". Can anyone help me out? Thanks :)

Here is the sandbox for it:

https://regex101.com/r/vvGJSs/1

CodePudding user response:

For the example data, you could rule out the unwanted characters on the left and the right using negative lookarounds, and use a word boundary to prevent a partial word match.

(?<![\w.-])-?\d+[lL]?\b(?!\.)
  • (?<![\w.-]) Negative lookbehind, assert not a word character, . or - directly to the left
  • -?\d+ Match an optional - and 1+ digits
  • [lL]? Optionally match l or L
  • \b A word boundary
  • (?!\.) Negative lookahead, assert not a dot to the right
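To check the pattern against the samples from the question, here is a minimal sketch in Python (my choice of language; the question is about C#/C++/Java literals, but the pattern should behave the same in most regex flavours that support lookbehind). The helper name find_int_literals is made up for illustration.

```python
import re

# Pattern from the answer above; the "+" after \d means "one or more digits".
INT_OR_LONG = re.compile(r"(?<![\w.-])-?\d+[lL]?\b(?!\.)")

def find_int_literals(text):
    """Return all int/long literals found in text (hypothetical helper)."""
    return INT_OR_LONG.findall(text)

print(find_int_literals("int: 1,23,-1,-100,[13]"))
# ['1', '23', '-1', '-100', '13']
print(find_int_literals("long: 1L,23l,23L,-1l,-100L,[16L]"))
# ['1L', '23l', '23L', '-1l', '-100L', '16L']
print(find_int_literals("not long: 1ll,1LL,1l0,1L0"))
# []
print(find_int_literals("3., .4, 5.6, m_ned3==ned2, m_ned=1e-7;"))
# []
```

Note that because - is in the lookbehind class, an expression such as a-1 yields no match at all, which may or may not be what you want.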


CodePudding user response:

It's usually better to use a negative assertion to define what's
not allowed. A negative assertion also succeeds at BOS and EOS (beginning and
end of string), where a positive one like (?<=\W) fails for lack of a character.
And that's really all the boundaries need to worry about.
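A small sketch of that difference, in Python (an arbitrary choice; the variable names are only for illustration). The first pattern is the one from the question, the second uses negative assertions with an explicit class:

```python
import re

# Positive lookarounds: a non-word character must exist on both sides.
positive = re.compile(r"(?<=\W)-?\d+[lL]?(?=\W)")
# Negative lookarounds: only forbid specific neighbours; string edges are fine.
negative = re.compile(r"(?<![\da-zA-Z.-])-?\d+[lL]?(?![\da-zA-Z.-])")

for text in ["42", "x = 42;"]:
    print(repr(text), positive.findall(text), negative.findall(text))
# '42' [] ['42']
# 'x = 42;' ['42'] ['42']
```

With the bare string "42" there is no character before or after the number, so the positive version finds nothing while the negative version still matches.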

The characters not allowed should be listed explicitly in a class. Try not to mix in predefined
classes that cover many character types, like \w, at least until you know all the cases you need.

((?<![\da-zA-Z.-])(-?\d+[lL]?)(?![\da-zA-Z.-]))

https://regex101.com/r/dMvxBD/1

 (                             # (1 start)
    (?<! [\da-zA-Z.-] )
    ( -? \d+ [lL]? )              # (2)
    (?! [\da-zA-Z.-] )
 )                             # (1 end)
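
The expanded form can be used almost as-is in a flavour with a free-spacing mode. Below is a rough sketch with Python's re.VERBOSE, run over the sample text from the question. I dropped the outer group so that findall returns plain strings; that is my simplification, not part of the answer.

```python
import re

INT_OR_LONG = re.compile(
    r"""
    (?<! [\da-zA-Z.-] )   # no digit, letter, dot or minus directly before
    ( -? \d+ [lL]? )      # (1) optional minus, digits, optional l/L suffix
    (?! [\da-zA-Z.-] )    # no digit, letter, dot or minus directly after
    """,
    re.VERBOSE,
)

sample = """int: 1,23,-1,-100,[13]
long: 1L,23l,23L,-1l,-100L,[16L]
if(m_ned==0) m_ned=1;
not long: 1ll,1LL,1l0,1L0
not int or long: 3., .4, 5.6,
if(m_ned3==ned2)
    m_ned=1e-7;
"""

print(INT_OR_LONG.findall(sample))
# ['1', '23', '-1', '-100', '13', '1L', '23l', '23L', '-1l', '-100L', '16L', '0', '1']
```

One case to be aware of: the underscore is not in the class, so the 1 in an identifier like x_1 would still match. Adding _ to both classes is one way to cover that if it shows up in your input.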