I have a tree represented by strings, for example:
A = '1@2@3@'
Where 1 is the parent of 2, which is the parent of 3, etc.
I'm trying to get a regular expression which matches only direct children of A.
For instance, if we have:
B = '1@2@3@4@' #child of A
C = '1@2@3@4@5@' #not child of A
D = '1@2@3@5@' #child of A
What is the regex that, using A, matches B and D, but not C?
Edit: The number of digits between "@" is arbitrary
CodePudding user response:
You can do this without regex:
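def ischildof(parent, child):
    # What remains after the parent prefix must be one non-empty segment ending in "@".
    rest = child[len(parent):]
    return child.startswith(parent) and len(rest) > 1 and rest.endswith("@") and "@" not in rest[:-1]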
Or with regex -- but maybe that's overkill:
import re
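def ischildof(parent, child):
    # Escape the parent so it matches literally, then require exactly one more "digits@" segment.
    return re.fullmatch(re.escape(parent) + r"\d+@", child) is not None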
Example run (for either of the above):
A = '1@2@3@'
print(ischildof(A, '1@2@3@4@')) # True
print(ischildof(A, '1@2@3@4@5@')) # False
print(ischildof(A, '1@2@3@5@')) # True
print(ischildof(A, '1@2@3@')) # False
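If you need to pick the direct children out of a whole list of nodes, here is a minimal sketch along the same lines. The helper name direct_children and the sample list are my own, purely for illustration; it compiles the pattern once and reuses it.

```python
import re

def direct_children(parent, nodes):
    """Return the nodes that are direct children of parent (hypothetical helper)."""
    # Literal parent prefix followed by exactly one more "digits@" segment.
    pattern = re.compile(re.escape(parent) + r"\d+@")
    # fullmatch requires the whole string to fit the pattern.
    return [n for n in nodes if pattern.fullmatch(n)]

nodes = ['1@2@3@', '1@2@3@4@', '1@2@3@4@5@', '1@2@3@5@', '1@2@30@']
print(direct_children('1@2@3@', nodes))  # ['1@2@3@4@', '1@2@3@5@']
```

Note that '1@2@30@' is rejected: it is a sibling of A under '1@2@', not a child of A.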
CodePudding user response:
To match a literal string, you can interpolate it into the pattern directly:
A = '1@2@3@'
str_regex = f"{A}"
To match one or more digits, use:
num_regex = "[0-9]+"
To require that the match extend to the end of the string, anchor the pattern with $:
end_str_regex = "(abc)$"
Any string that does not end with 'abc' will then fail to match. I believe this solves your direct-children problem. I think your regex would look like this:
f"{A}[0-9]+@$"
Feel free to modify it if you have further criteria. Be careful when combining f-strings with regex: curly braces (for example in a quantifier like {2}) and backslashes clash with f-string syntax, so a raw f-string or plain string concatenation may be safer. You can read more about regex here: https://docs.python.org/3/library/re.html.
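Putting those pieces together, a rough sketch might look like the following. The name is_direct_child is made up for this example, and I have added re.escape as a precaution in case the parent string ever contains regex metacharacters (it does not in the question). The pattern only anchors the end, so which re function you call matters: re.search can start matching in the middle of the string, while match starts at the beginning.

```python
import re

A = '1@2@3@'

# Concatenation instead of an f-string, with the parent escaped.
pattern = re.compile(re.escape(A) + r"[0-9]+@$")

def is_direct_child(s):
    # match() anchors at the start of s; the trailing $ anchors at the end.
    return pattern.match(s) is not None

print(is_direct_child('1@2@3@4@'))    # True
print(is_direct_child('1@2@3@4@5@'))  # False
print(is_direct_child('1@2@3@5@'))    # True
print(is_direct_child('1@2@3@'))      # False

# search() is not anchored at the start, so a longer prefix slips through:
print(pattern.search('91@2@3@4@') is not None)  # True
print(is_direct_child('91@2@3@4@'))             # False
```

One more caveat: $ also matches just before a trailing newline, so if your strings may end in "\n", re.fullmatch without the $ is the stricter choice.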