What regular expression matches the name of a variable or name of a function while allowing the left

Time:01-23

The regular expression (regex) shown below is an example of something that would match a variable name or function name in an older language such as C.

[a-zA-Z_][a-zA-Z_0-9]*

Partially translated, we have:

  • [a-zA-Z_] means exactly one character from [a-zA-Z_]
  • [a-zA-Z_0-9]* means zero or more of [a-zA-Z_0-9]

When the regular expression shown above is translated into mathematical English, we have:

Exactly one letter or underscore on the left, followed by zero or more symbols taken from the set of all letters A through Z (upper or lower case), underscores, and digits zero through nine.
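As a quick check of that reading, here is a minimal Python sketch. It assumes the pattern is meant to be written without any whitespace between the two character classes, and the helper name is_identifier is made up for illustration. It uses re.fullmatch so that the whole string has to be an identifier.

```python
import re

# C-style identifier: exactly one letter or underscore, then zero or
# more letters, digits, or underscores.
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")

def is_identifier(text):
    """Return True when the entire string is a C-style identifier.

    Hypothetical helper, not part of any library.
    """
    return IDENTIFIER.fullmatch(text) is not None

for sample in ["main", "get_user_input", "_", "_callable", "0orange", "42"]:
    print(sample, is_identifier(sample))
```

Names that begin with a digit, such as 0orange, are rejected by this pattern, which is the limitation the question is about.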

My question is: what regex would match names for variables, allowing digits on the left, but not match numeric literals such as 6.4222?

It would be ideal if the regex would not match any of the following integer literals:

INTEGER     REMARK
42          The answer to the ultimate question of life, the universe and everything
052         Octal (base 8)
0x2a        Hexadecimal (base 16) with a lower-case letter a
0X2A        Hexadecimal (base 16) with an upper-case letter A
0b101010    Binary

The regular expression (regex) should match all of the following strings:

0orange
1kiwi  
8apple
main
get_user_input
_
_callable

CodePudding user response:

The following regular expression will match both of the following:

  • variable names in the C programming language
  • variable names in C with a string of one or more digits (0 through 9) inserted at the left of the variable name

\b[0-9]*(?:(?!x)[a-zA-Z_])[a-zA-Z_0-9]*\b

String            is_match
052               false
789321            false
25.122            false
0x2a              false
INTEGER           true
REMARK            true
0X2A              true
0b101010          true
0orange           true
1kiwi             true
8apple            true
main              true
get_user_input    true
_                 true
_callable         true

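As the table shows, the regular expression does not match decimal or octal integer constants, floating-point constants, or hexadecimal constants written with a lower-case 0x prefix. It still matches 0X2A and 0b101010.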

You can try \b[0-9]*(?:(?!x)[a-zA-Z_])[a-zA-Z_0-9]*\b at regex101.com.
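If you would rather check the table locally, the following is a rough Python sketch. It assumes the pattern is intended to have no whitespace between its groups, and it uses re.search, so a match anywhere in the string counts. The name is_match is borrowed from the table header and is not a library function.

```python
import re

# Pattern from this answer, written without whitespace between groups
# (an assumption about the intended regex).
PATTERN = re.compile(r"\b[0-9]*(?:(?!x)[a-zA-Z_])[a-zA-Z_0-9]*\b")

def is_match(text):
    """Return True when the pattern matches somewhere in text."""
    return PATTERN.search(text) is not None

samples = [
    "052", "789321", "25.122", "0x2a", "0X2A", "0b101010",
    "0orange", "1kiwi", "8apple", "main", "get_user_input",
    "_", "_callable",
]
for sample in samples:
    print(f"{sample:<16}{is_match(sample)}")
```

One thing to watch for: the (?!x) lookahead applies to the first non-digit character, so under this sketch an ordinary name that starts with a lower-case x, such as xmax, is rejected as well.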

CodePudding user response:

[a-zA-Z_0-9]*[a-zA-Z_][a-zA-Z_0-9]*

[a-zA-Z_] in the middle requires at least one non-digit character somewhere in the token.
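Here is a small Python sketch of how this pattern behaves, assuming each candidate is tested as a whole token with re.fullmatch. The helper name is_name is hypothetical.

```python
import re

# Any mix of letters, digits, and underscores, as long as at least one
# character is a letter or underscore.
PATTERN = re.compile(r"[a-zA-Z_0-9]*[a-zA-Z_][a-zA-Z_0-9]*")

def is_name(token):
    """Return True when the entire token matches the pattern."""
    return PATTERN.fullmatch(token) is not None

for token in ["0orange", "1kiwi", "_", "main", "42", "052", "6.4222", "0x2a", "0b101010"]:
    print(token, is_name(token))
```

Plain decimal and octal literals and 6.4222 are rejected under this sketch. Prefixed literals such as 0x2a and 0b101010 contain a letter, so they are still accepted.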
