I have a string that looks like "----.-------.-----.---". Here, ---- is any substring of random length and . is the separator. The string can have a predetermined number of separators, which I can change dynamically.
How can I use regex to validate that a string new_string matches this pattern?
I found some solutions online, but none account for a random-length substring and a dynamic number of separators.
CodePudding user response:
Use the in operator, but try to avoid str.count if possible: str.split can also be used to check the number of separators, and IIUC it scans the string in much the same way under the hood, so it is probably worth eliminating what could be a duplicate iteration in our case.
Added my timings below just to double check this:
from timeit import timeit

text1 = "aaaa.bb..ccc"
text2 = "aaaa.bbbbbbb.cccc.ddd"

def validate1(text):
    # count() and split() each scan the whole string
    return text.count('.') == 3 \
        and '..' not in text \
        and all(x.isalnum() for x in text.split('.'))

def validate2(text):
    # split once and reuse the parts for both checks
    if '..' in text:
        return False
    parts = text.split('.')
    return len(parts) == 4 \
        and all(x.isalnum() for x in parts)

print('validate1: ', timeit('validate1(text1); validate1(text2)', globals=globals()))
print('validate2: ', timeit('validate2(text1); validate2(text2)', globals=globals()))

assert validate1(text1) is validate2(text1) is False
assert validate1(text2) is validate2(text2) is True
Note that, apparently, even the all() timing can be slightly improved by writing it like this instead (at the cost of no longer short-circuiting on the first bad part):
len([1 for x in parts if x.isalnum()]) == 4
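Since the question asks for a number of separators that can change, here is a minimal sketch of the split-based check with the count as a parameter. The names validate_parts, n_seps and sep are my own, not from the question, and I am assuming (as in the code above) that every part must be non-empty and alphanumeric.

```python
def validate_parts(text, n_seps, sep="."):
    """Return True if text has exactly n_seps separators between
    non-empty alphanumeric parts (assumed rule, see above)."""
    parts = text.split(sep)
    # n_seps separators always produce n_seps + 1 parts.
    # An empty part (leading, trailing or doubled separator) fails
    # isalnum(), so no separate check for consecutive separators is needed.
    return len(parts) == n_seps + 1 and all(p.isalnum() for p in parts)
```

Calling validate_parts(new_string, 3) should then behave like validate2 above, and changing the second argument changes how many separators are required.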
CodePudding user response:
You can use
text.count('.') == 3 and '..' not in text and all(x.isalnum() for x in text.split('.'))
where
text.count('.') == 3 - checks that the string contains exactly three periods
'..' not in text - disallows consecutive dots
all(x.isalnum() for x in text.split('.')) - makes sure that all parts between dots consist only of alphanumeric chars.
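If you do want the regex the question asked about, a rough sketch could build the pattern from the separator count. build_pattern, n_seps and sep are hypothetical names, and I am assuming the parts are ASCII letters and digits. Note that str.isalnum() also accepts non-ASCII alphanumerics, so this is not exactly equivalent to the check above.

```python
import re

def build_pattern(n_seps, sep="."):
    """Compile a pattern for n_seps + 1 non-empty ASCII-alphanumeric
    parts joined by sep (an assumed reading of the question)."""
    part = r"[A-Za-z0-9]+"
    # re.escape makes '.' (or any other separator) match literally;
    # the doubled braces emit a literal {n} repetition count.
    return re.compile(rf"{part}(?:{re.escape(sep)}{part}){{{n_seps}}}")

pattern = build_pattern(3)
print(bool(pattern.fullmatch("aaaa.bbbbbbb.cccc.ddd")))  # True
print(bool(pattern.fullmatch("aaaa.bb..ccc")))           # False
```

fullmatch is used rather than match or search so that the whole string has to fit the pattern, not just a prefix or a substring.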