I'm trying to create a regex pattern to match account IDs that follow certain rules. The matching will happen in a Python script using the re module, but I believe this is mostly a general regex question.
The account ids adhere to the following rules:
- Must be exactly 6 characters long
- The letters and numbers do not have to be unique
AND
- 3 uppercase letters followed by 3 numbers
OR
- 1 to 6 numbers followed by however many uppercase letters bring the length of the id to 6
So, the following would be 'valid' account ids:
ABC123
123456
12345A
1234AB
123ABC
12ABCD
1ABCDE
AAA111
And the following would be 'invalid' account ids:
ABCDEF
ABCDE1
ABCD12
AB1234
A12345
ABCDEFG
1234567
1
12
123
1234
12345
I can match the 3 letters followed by 3 numbers very simply, but I'm having trouble understanding how to write a regex that matches a varying number of letters, such that if x is the number of numbers in the string, then the number of letters is y = 6 - x.
I suspect that using lookaheads might help solve this problem, but I'm still new to regex and don't have an amazing grasp on how to use them correctly.
I have the following regex right now, which uses positive lookaheads to check if the string starts with a number or letter, and applies different matching rules accordingly:
((?=^[0-9])[0-9]{1,6}[A-Z]{0,5}$)|((?=^[A-Z])[A-Z]{3}[0-9]{3}$)
This matches the 'valid' account ids listed above. However, it also matches the following strings, which should be invalid:
- 1
- 12
- 123
- 1234
- 12345
How can I change the first capturing group ((?=^[0-9])[0-9]{1,6}[A-Z]{0,5}$) to know how many letters to match based on how many numbers begin the string, if that's possible?
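A minimal sketch that reproduces the problem, assuming the two lists above are used as test data (the names CURRENT_PATTERN, valid, invalid, missed and false_positives are only for illustration):

```python
import re

# The pattern from the question, unchanged.
CURRENT_PATTERN = re.compile(
    r"((?=^[0-9])[0-9]{1,6}[A-Z]{0,5}$)|((?=^[A-Z])[A-Z]{3}[0-9]{3}$)"
)

valid = ["ABC123", "123456", "12345A", "1234AB",
         "123ABC", "12ABCD", "1ABCDE", "AAA111"]
invalid = ["ABCDEF", "ABCDE1", "ABCD12", "AB1234", "A12345", "ABCDEFG",
           "1234567", "1", "12", "123", "1234", "12345"]

# Valid ids that the pattern fails to match.
missed = [s for s in valid if not CURRENT_PATTERN.search(s)]
# Invalid ids that the pattern matches anyway.
false_positives = [s for s in invalid if CURRENT_PATTERN.search(s)]

print("missed:", missed)
print("false positives:", false_positives)
```

Nothing in the pattern ties the number of letters to the number of digits, so the short all-digit strings get through once [A-Z]{0,5} matches zero letters.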
CodePudding user response:
I am unsure how to modify your regex to ensure that the overall username length is 6 characters. However, it is very easy to check that in Python.
import re

def check_username(name):
    # Enforce the total length in Python, then let the regex check the layout.
    if len(name) == 6:
        if re.search(r"((?=^[0-9])[0-9]{1,6}[A-Z]{0,5}$)|((?=^[A-Z])[A-Z]{3}[0-9]{3}$)", name) is not None:
            return True
    return False
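As a rough usage sketch, here is a condensed variant of the same check run against a few ids taken from the question. The PATTERN constant and the sample list are only for illustration, and the function is repeated so the snippet runs on its own:

```python
import re

# The pattern from the question, unchanged.
PATTERN = r"((?=^[0-9])[0-9]{1,6}[A-Z]{0,5}$)|((?=^[A-Z])[A-Z]{3}[0-9]{3}$)"

def check_username(name):
    # Length check first; only 6-character strings reach the regex.
    return len(name) == 6 and re.search(PATTERN, name) is not None

for candidate in ["ABC123", "12345A", "12345", "ABCDEF", "1234567"]:
    print(candidate, check_username(candidate))
```

Because the length is fixed at 6 before the regex runs, any match of 1 to 6 numbers followed by letters necessarily has exactly 6 minus that many letters.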
Hopefully this is helpful to you!
CodePudding user response:
You could write the pattern as:
^(?=[A-Z\d]{6}$)(?:[A-Z]{3}\d{3}|\d+[A-Z]*)$
Explanation
- ^
  Start of string
- (?=[A-Z\d]{6}$)
  Positive lookahead, assert 6 chars A-Z or digits till the end of the string
- (?:
  Non capture group for the alternatives
- [A-Z]{3}\d{3}
  Match 3 chars A-Z and 3 digits
- |
  Or
- \d+[A-Z]*
  Match 1 or more digits and optional chars A-Z
- )
  Close the non capture group
- $
  End of string
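A short sketch to check the pattern against the examples from the question, assuming the digit part is meant to be \d+ (one or more digits). The names ACCOUNT_ID and is_valid_account_id are made up for this example:

```python
import re

# Lookahead fixes the total length at 6; the alternation fixes the layout.
ACCOUNT_ID = re.compile(r"^(?=[A-Z\d]{6}$)(?:[A-Z]{3}\d{3}|\d+[A-Z]*)$")

def is_valid_account_id(text):
    """Return True if text matches the account id pattern."""
    return ACCOUNT_ID.search(text) is not None

valid = ["ABC123", "123456", "12345A", "1234AB",
         "123ABC", "12ABCD", "1ABCDE", "AAA111"]
invalid = ["ABCDEF", "ABCDE1", "ABCD12", "AB1234", "A12345", "ABCDEFG",
           "1234567", "1", "12", "123", "1234", "12345"]

# Both lists should come out empty if the pattern behaves as described.
wrongly_rejected = [s for s in valid if not is_valid_account_id(s)]
wrongly_accepted = [s for s in invalid if is_valid_account_id(s)]

print("wrongly rejected:", wrongly_rejected)
print("wrongly accepted:", wrongly_accepted)
```

Note that for str patterns in Python 3, \d also matches non-ASCII decimal digits. If only 0-9 should be accepted, [0-9] or the re.ASCII flag may be the safer choice.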