Regex to recognise a specific language element

Time:01-03

I have to create a specific regex, but nothing I have tried works in the end. I have seen similar posts that use a negative lookbehind to solve this, but apparently it is not supported in my version.

The regex I have to create should:

recognise identifiers that may start with underscore, followed MANDATORY by an alphabetical character, followed by one or more alphanumeric or underscore characters. It is essential that the string CANNOT END with underscore.

I have tried this: _*[a-zA-Z][a-zA-Z0-9_]*[^_]$ but it does not work for all cases.
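
A likely reason this attempt misbehaves: [^_] matches any character that is not an underscore, including punctuation, and the pattern has no leading ^ anchor, so it can match a suffix of an invalid string. A minimal sketch in Python, assuming the pattern is applied with a search-style call (the helper name attempt_accepts is hypothetical):

```python
import re

# The first attempt from the question, unchanged.
ATTEMPT = re.compile(r"_*[a-zA-Z][a-zA-Z0-9_]*[^_]$")

def attempt_accepts(s: str) -> bool:
    # search() is used because the pattern carries no leading ^ anchor.
    return ATTEMPT.search(s) is not None

for s in ["a5", "5_v_2", "a!", "a_"]:
    # "5_v_2" is accepted via the suffix "_v_2"; "a!" because [^_] matches "!".
    print(s, attempt_accepts(s))
```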

The solution with the negative lookbehind also causes issues for me: _*[a-zA-Z][a-zA-Z0-9_]*$(?<!_)

Some examples of accepted cases:

  1. a5
  2. _a5___v3
  3. a5_v2_2

and not accepted:

  1. 5_v_2
  2. a5_v2_
  3. _5_v_2
  4. _5
  5. a_
  6. a5--v-2

CodePudding user response:

may start with underscore

Use _? not _*

followed MANDATORY by an alphabetical character

[a-zA-Z] looks good depending on what alphabets are acceptable

followed by one or more alphanumeric or underscore characters.

Let's use [a-zA-Z0-9_]* with a * not a + because:

It is essential that the string CANNOT END with underscore

Here is the last chunk [a-zA-Z0-9]

Final result: _?[a-zA-Z][a-zA-Z0-9_]*[a-zA-Z0-9]
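
The final pattern can be checked against the examples from the question. This is a minimal sketch in Python, assuming the whole string must match (done here with fullmatch, since the pattern itself has no ^ or $ anchors); the helper name is_identifier is hypothetical:

```python
import re

# Pattern from the answer; whole-string matching is an assumption,
# enforced with fullmatch() rather than explicit ^...$ anchors.
IDENTIFIER = re.compile(r"_?[a-zA-Z][a-zA-Z0-9_]*[a-zA-Z0-9]")

def is_identifier(s: str) -> bool:
    return IDENTIFIER.fullmatch(s) is not None

accepted = ["a5", "_a5___v3", "a5_v2_2"]
rejected = ["5_v_2", "a5_v2_", "_5_v_2", "_5", "a_", "a5--v-2"]

assert all(is_identifier(s) for s in accepted)
assert not any(is_identifier(s) for s in rejected)
```

In an engine without a full-match call, wrapping the pattern as ^...$ should behave the same way. If several leading underscores are meant to be allowed, _? would need to become _* again.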
