Python Regular Expression Matching

Time:06-18

I have a tree represented by strings, for example:

A = '1@2@3@'

Where 1 is the parent of 2, which is the parent of 3, etc.

I'm trying to get a regular expression which matches only direct children of A.

For instance, if we have:

B = '1@2@3@4@' #child of A
C = '1@2@3@4@5@' #not child of A
D = '1@2@3@5@' #child of A

What is the regex that, using A, matches B and D, but not C?

Edit: The number of digits between "@" is arbitrary

CodePudding user response:

You can do this without regex:

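def ischildof(parent, child):
    rest = child[len(parent):]  # what follows the parent prefix
    # rest must be one non-empty segment ending in "@", with no other "@" inside it
    return child.startswith(parent) and len(rest) > 1 and rest.endswith("@") and "@" not in rest[:-1]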

Or with regex -- but maybe that's overkill:

import re

def ischildof(parent, child):
    return not not re.fullmatch(re.escape(parent) + r"\d+@", child)

Example run (for either of the above):

A = '1@2@3@'
print(ischildof(A, '1@2@3@4@'))  # True
print(ischildof(A, '1@2@3@4@5@'))  # False
print(ischildof(A, '1@2@3@5@'))  # True
print(ischildof(A, '1@2@3@'))  # False
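
If many candidates are checked against the same parent, the regex variant can be compiled once and reused. The following is only a sketch under that assumption; the helper name direct_children and the sample list nodes are made up for illustration and are not part of the original answer:

```python
import re

def direct_children(parent, candidates):
    """Return the candidates that are direct children of `parent`."""
    # Escape the parent so it is matched literally, then require
    # exactly one more "digits@" segment after it.
    pattern = re.compile(re.escape(parent) + r"\d+@")
    return [c for c in candidates if pattern.fullmatch(c)]

# Hypothetical sample data, including a sibling-looking node ('1@2@30@').
nodes = ['1@2@3@4@', '1@2@3@4@5@', '1@2@3@5@', '1@2@3@', '1@2@30@']
print(direct_children('1@2@3@', nodes))  # ['1@2@3@4@', '1@2@3@5@']
```

Using fullmatch means the whole string has to fit the pattern, so no explicit ^ or $ anchors are needed.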

CodePudding user response:

If you want to match a literal string, you can write it out directly, like this:

A = '1@2@3@'
str_regex = f"{A}"

If you want to match any number with at least one digit, you can do this:

num_regex = "[0-9]+"

If you want your regex to match up to the end of a string, anchor it with $ like this:

end_str_regex = "(abc)$"

This makes any string that does not end with 'abc' fail to match. I believe this would solve your direct-children problem. I think your regex would look like this:

f"{A}[0-9]+@$"

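Feel free to modify it if you have further criteria. Mixing an f-string with regex syntax as above can cause trouble, because backslashes and braces in the pattern are also interpreted by the f-string; a raw f-string (rf"...") takes care of the backslashes, and plain string concatenation avoids both problems. You can read more about regex here: https://docs.python.org/3/library/re.html.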
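
As a rough sketch of how the pieces above could fit together, the pattern can be built by concatenation instead of an f-string. The names child_regex and is_direct_child are made up for illustration, and calling re.escape on A is an extra precaution that the answer itself does not mention:

```python
import re

A = '1@2@3@'

# Concatenate instead of using an f-string, and escape A in case it
# ever contains regex metacharacters.
child_regex = re.compile(re.escape(A) + r"[0-9]+@$")

def is_direct_child(s):
    # match() anchors at the start of s; the trailing $ anchors at the end.
    return child_regex.match(s) is not None

print(is_direct_child('1@2@3@4@'))    # True
print(is_direct_child('1@2@3@4@5@'))  # False
print(is_direct_child('1@2@3@5@'))    # True

# search() does not anchor at the start, so a longer path that merely
# ends with a child of A would also be accepted:
print(child_regex.search('9@1@2@3@4@') is not None)  # True
```

The last line shows why the choice between match and search matters here: $ only pins the end of the string, so the start has to be anchored separately.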
