In a file, I have the following lines:
[Line 1] My Name is Adam;
[Line 2] <Blank Line>
[Line 3] My Name
[Line 4] is Adam Lee;
[Line 5] <Blank Line>
[Line 6] My
[Line 7] Name
[Line 8] is
[Line 9] Adam
[Line 10] Lee;
My tokens are 'My', 'Name' and 'Adam', and I know that each statement ends with ';'.
Here is how I have written my code in Python:
import re
import sys

# Read the input file
try:
    file_path = sys.argv[1]
    with open(file_path) as f:
        my_file = f.read()
except Exception as err:
    print("Exception caught while opening the file!")
    print(repr(err))
    sys.exit(1)

# Find matches
my_regex = r"^[ ]*My\s+Name.*Adam.*[;/]"
matches = re.findall(my_regex, my_file, flags=re.IGNORECASE | re.MULTILINE)
Observation: Only Line 1 is matched. I expected Lines 3-4 and Lines 6-10 to match as well, since they contain the tokens and end with the delimiter. How can I modify my regex? Please help.
CodePudding user response:
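By default the dot does not match a newline, so .* cannot span several lines. You might write the pattern using a negated character class instead, which matches any char except a semicolon, newlines included: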
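^ *My\s+Name[^;]*Adam[^;]*;
'^' Start of a line (with re.MULTILINE; otherwise only the start of the string)
' *' Match optional spaces
'My\s+Name' Match My and Name with 1 or more whitespace chars in between (\s also matches a newline)
'[^;]*Adam[^;]*' Match Adam between optional chars other than ;
';' Match the ; that ends the statement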
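
To see the difference, here is a minimal sketch that runs both patterns against the sample from the question. The names sample, old_pattern and new_pattern are only illustrative, and the text is inlined as a string rather than read from a file, so adjust it to your real input.

```python
import re

# Sample text mirroring the ten lines shown in the question (assumed layout).
sample = (
    "My Name is Adam;\n"
    "\n"
    "My Name\n"
    "is Adam Lee;\n"
    "\n"
    "My\n"
    "Name\n"
    "is\n"
    "Adam\n"
    "Lee;\n"
)

flags = re.IGNORECASE | re.MULTILINE

# Pattern from the question: '.' stops at a newline, so only single-line statements match.
old_pattern = re.compile(r"^[ ]*My\s+Name.*Adam.*[;/]", flags)

# Pattern from the answer: [^;] also matches newlines, so a statement may span lines.
new_pattern = re.compile(r"^ *My\s+Name[^;]*Adam[^;]*;", flags)

old_matches = old_pattern.findall(sample)
matches = new_pattern.findall(sample)

print(len(old_matches), "match with the original pattern")
for m in matches:
    print(repr(m))
```

With this sample the original pattern finds only the first statement, while the negated character class finds all three. Because [^;]* stops at the first semicolon, a match cannot run on into the next statement.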