Python Regex for Searching pattern in text file

Time:07-24

Tags in Sample.txt:

<ServiceRQ>want everything between...</ServiceRQ>

<ServiceRQ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance>want everything between</ServiceRQ> ..

Can someone help me write a regex to extract the expected output from a text file? I want to find the tags above by searching on the word RQ. This is what I have tried, but it is not working properly: re.search(r"<(.*?)RQ(.*?)>(.*?)</(.*?)RQ>", line)

The expected output should be

1. <ServiceRQ>want everything between</ServiceRQ>
2. <ServiceRQ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance>want everything between</ServiceRQ>

CodePudding user response:

Try this pattern

import re

regex = r'<\w+RQ.*?>.*?</\w+RQ>'
data = re.findall(regex, line)

The regex above gives output like this:

['<ServiceRQ>want everything between...</ServiceRQ>', '<ServiceRQ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance>want everything between</ServiceRQ>']
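For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It assumes the two sample lines from the question are held in one string; the names `sample`, `TAG_PATTERN` and `find_rq_tags` are made up for illustration. Note the `+` after `\w`: it lets the pattern match one or more word characters before `RQ`.

```python
import re

# Sample text modeled on the two lines shown in the question.
sample = (
    '<ServiceRQ>want everything between...</ServiceRQ>\n'
    '<ServiceRQ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance>'
    'want everything between</ServiceRQ> ..\n'
)

# <\w+RQ   opening tag whose name ends in RQ
# .*?>     optional attributes, up to the first closing bracket
# .*?      tag contents (lazy, so it stops at the first closing tag)
# </\w+RQ> closing tag whose name ends in RQ
TAG_PATTERN = re.compile(r'<\w+RQ.*?>.*?</\w+RQ>')


def find_rq_tags(text):
    """Return every complete ...RQ element found in text (hypothetical helper)."""
    return TAG_PATTERN.findall(text)


matches = find_rq_tags(sample)
for match in matches:
    print(match)
```

Because `.` does not match a newline by default, each match stays on a single line, which fits the sample data here.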

CodePudding user response:

As Ashish mentioned, this pattern returns each tag together with its contents.

regex = r'<\w+RQ.*?>.*?</\w+RQ>'
data = re.findall(regex, line)

To retrieve just the contents within the tags, change the .*? between the tags to a capturing group, (.*?). When a pattern contains one group, re.findall returns only the text captured by that group.

regex = r'<\w+RQ.*?>(.*?)<\/\w+RQ>'
data = re.findall(regex, sample)

This would result in the following output:

['want everything between...', 'want everything between']
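Since the question is about searching a text file, here is a rough sketch of the same idea applied to a whole file. It is only one way to do it, under a few assumptions: the file is small enough to read in one go, `extract_rq_contents` and `CONTENT_PATTERN` are hypothetical names, and the `BookingRQ` element is invented to show contents that span two lines. The `re.DOTALL` flag is added so that `.` also matches newlines.

```python
import os
import re
import tempfile

# Same pattern as above with a capturing group around the contents.
# re.DOTALL lets "." match newlines, so multi-line contents are captured too.
CONTENT_PATTERN = re.compile(r'<\w+RQ.*?>(.*?)</\w+RQ>', re.DOTALL)


def extract_rq_contents(path):
    """Read the whole file and return the text inside each ...RQ element."""
    with open(path, encoding='utf-8') as f:
        return CONTENT_PATTERN.findall(f.read())


# Build a throwaway Sample.txt so the example runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'Sample.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(
            '<ServiceRQ>want everything between...</ServiceRQ>\n'
            '<ServiceRQ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance>'
            'want everything between</ServiceRQ> ..\n'
            # Invented element, only here to show a multi-line match.
            '<BookingRQ>line one\nline two</BookingRQ>\n'
        )
    contents = extract_rq_contents(path)

print(contents)
```

If the files are real XML and well formed, a parser such as `xml.etree.ElementTree` is usually more robust than a regex; the sample in the question has an unclosed attribute quote, so a regex may be the practical choice here.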