I have a simple list in a text file:
animal
ball
cat
dog
elephant
fox
I am trying to get Python 3.8 to search the list and return everything from ball to dog, inclusive. The contents of this text file are in the data variable:
x = re.findall("ball(.*\n)*?.*dog", data)
print(x)
On regexr.com and in Notepad this works fine:
ball
cat
dog
but in Python the result is:
['cat\n']
How can I make it work the way I want in Python?
CodePudding user response:
I don't understand why you need findall() in the first place. re.search() by itself is enough to accomplish this, shown here on a comma-separated version of the list:
In [1]: import re
...: str1= 'animal,ball,cat,dog,elephant,fox'
...: m = re.search("(ball.*dog)", str1)
...:
In [2]: m.groups()
Out[2]: ('ball,cat,dog',)
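Note that the session above uses a single comma-separated line, while the question's data is newline-separated. By default "." does not match a newline, so the same pattern would not span lines. A minimal sketch, assuming data holds the file contents exactly as listed in the question:

```python
import re

# Assumed contents of the text file from the question, one item per line.
data = "animal\nball\ncat\ndog\nelephant\nfox\n"

# Without DOTALL, "." stops at a newline, so the pattern cannot span lines.
no_flag = re.search(r"ball.*dog", data)

# With DOTALL, "." also matches "\n"; the lazy ".*?" stops at the first "dog".
with_flag = re.search(r"ball.*?dog", data, flags=re.DOTALL)

print(no_flag)                 # None
print(repr(with_flag.group())) # 'ball\ncat\ndog'
```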
CodePudding user response:
Make the group non-capturing, because if the pattern contains exactly one capturing group, findall returns a list of the strings matched by that group instead of the whole match.
"ball(?:.*\n)*?.*dog"
Alternatives:
ball(?:.|\s)*?dog
(?s)ball.*?dog
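To see the difference side by side, here is a small sketch comparing the original capturing pattern with the non-capturing and (?s) variants. The data string is an assumption that mirrors the file contents from the question:

```python
import re

# Assumed contents of the text file from the question, one item per line.
data = "animal\nball\ncat\ndog\nelephant\nfox\n"

# One capturing group: findall returns only what that group matched
# (the last repetition of the group, here the "cat" line).
capturing = re.findall(r"ball(.*\n)*?.*dog", data)

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r"ball(?:.*\n)*?.*dog", data)

# Inline DOTALL flag: "." matches newlines too, so no group is needed.
dotall = re.findall(r"(?s)ball.*?dog", data)

print(capturing)      # ['cat\n']
print(non_capturing)  # ['ball\ncat\ndog']
print(dotall)         # ['ball\ncat\ndog']
```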
CodePudding user response:
I am assuming that the items in your string are separated by newlines (\n).
Try this:
import re

string = """
animal
ball
cat
dog
elephant
fox
"""
# [\w\n]+ matches one or more word characters or newlines between "ball" and "dog"
matches = re.findall(r"ball[\w\n]+dog", string)
if matches:
    list_of_animals = matches[0].split("\n")
    print(list_of_animals)
CodePudding user response:
You can use the search function of re instead of findall, and then call the group method of the resulting re.Match object:
import re
data = """
animal
ball
cat
dog
elephant
fox
"""
x = re.search("ball(.*\n)*?.*dog", data)
print(x.group())
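One caveat: re.search returns None when nothing matches, so calling group() directly would raise an AttributeError in that case. A minimal sketch with a guard, which also shows that group() returns the whole match while group(1) returns only the capturing group. The variable names whole, last_group and items are illustrative, not part of the original code:

```python
import re

# Assumed contents of the text file from the question, one item per line.
data = "animal\nball\ncat\ndog\nelephant\nfox\n"

m = re.search(r"ball(.*\n)*?.*dog", data)

if m is None:
    # No match: fall back to empty results instead of raising.
    whole, last_group, items = None, None, []
else:
    whole = m.group()          # entire match, regardless of capturing groups
    last_group = m.group(1)    # last repetition of the capturing group
    items = whole.splitlines() # one list entry per line of the match

print(items)  # ['ball', 'cat', 'dog']
```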