How to write a Python regular expression that ensures that a character isn't in a specific position

I am attempting to write a regular expression that validates usernames. One of the requirements is that if there are numbers in the string, they must come after the letters and the first number may not be zero. For some reason I cannot get it to reject strings that contain a set of numbers starting with zero.

import re

username_pattern = r'^[a-z]{3}[a-z]*([1-9]*|[1-9]*[0-9]*)'
if not re.fullmatch(username_pattern, test_username):
    raise ValueError('username may have numbers but they must be after all the letters and may not begin with 0')

I expected something like "abc123" to pass and "abc0123" to fail, but both pass.

CodePudding user response：

You can use r'^[a-z]{3,}([1-9]+[0-9]*)?' instead. Explanation:

[a-z]{3,} matches a-z at least 3 times; this is a shorter version of the [a-z]{3}[a-z]* that you are currently using.
([1-9]+[0-9]*) ensures that the digits start with at least one character from 1-9 (+ means at least one character must match), followed by any number of 0-9 characters.
The trailing ? makes the numbers group optional, so usernames without any digits still match.
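
To see why the pattern in the question accepts "abc0123", here is a small side-by-side sketch. It assumes the suggested pattern is written with a + after [1-9], i.e. ([1-9]+[0-9]*)?; the variable names original and fixed are only for illustration. In the question's pattern, [1-9]* is allowed to match nothing, which leaves [0-9]* free to consume the leading 0.

```python
import re

# Pattern from the question: [1-9]* may match zero characters,
# so [0-9]* can then swallow a leading 0 together with the rest of the digits.
original = r'^[a-z]{3}[a-z]*([1-9]*|[1-9]*[0-9]*)'

# Suggested pattern (assumed form): the digit group, when present,
# must start with at least one character from 1-9.
fixed = r'^[a-z]{3,}([1-9]+[0-9]*)?'

for name in ['abc123', 'abc0123', 'abc']:
    ok_original = bool(re.fullmatch(original, name))
    ok_fixed = bool(re.fullmatch(fixed, name))
    print(f'{name}: original={ok_original}, fixed={ok_fixed}')

# abc123: original=True, fixed=True
# abc0123: original=True, fixed=False
# abc: original=True, fixed=True
```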

Here's a quick example showing various user names -

import re
users = ['xyz123', 'xyz1234', 'xyz01', 'abdcd213', 'xy1234']
username_pattern = r'^[a-z]{3,}([1-9]+[0-9]*)?'
for user in users:
    if not re.fullmatch(username_pattern, user):
        print(f'Invalid user: {user}')

Output:

Invalid user: xyz01
Invalid user: xy1234
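
If you want to keep the ValueError behaviour from the question, the check could be wrapped in a small helper along these lines. This is only a sketch: the names USERNAME_RE and validate_username are made up for the example, and the pattern is written as [1-9][0-9]*, which should be equivalent to [1-9]+[0-9]*. Since re.fullmatch already requires the whole string to match, the leading ^ is left out here.

```python
import re

# Hypothetical names, not from the original post.
# At least three lowercase letters, then optionally digits that do not start with 0.
USERNAME_RE = re.compile(r'[a-z]{3,}([1-9][0-9]*)?')


def validate_username(name: str) -> str:
    """Return the name unchanged if it is valid, otherwise raise ValueError."""
    if USERNAME_RE.fullmatch(name) is None:
        raise ValueError(
            'username may have numbers but they must be after all the '
            'letters and may not begin with 0'
        )
    return name


for candidate in ['abc123', 'abc0123']:
    try:
        validate_username(candidate)
        print(f'Valid user: {candidate}')
    except ValueError:
        print(f'Invalid user: {candidate}')
```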