Use Regex to find all occurrences of a specific string and pair it with subsequent occurrences of an

Time:05-21

Right now I have a test.txt file that I'm reading in. It contains several newline characters, so I am using re.DOTALL. How can I combine subsequent patterns into pairs?

test.txt:

      blah blah blah||| blah blah|| 
                Key_one1_end  ||   blah blah
                blah blah || blah
blah blah |||||| blah blah Value_number : 10
      blah blah blah||| blah blah|| 
                Key_two2_end  ||   blah blah
                blah blah || blah
                      Value_number : f

This is my code:

import re

# Read the whole file and collect every key or value match
with open(r'path/to/file/test.txt') as f:
    matches = re.findall(r'(Key_\w*_end)|(Value_number...\w*)', f.read(), re.DOTALL)
print(matches)

Output: [('Key_one1_end', ''), ('', 'Value_number : 10'), ('Key_two2_end', ''), ('', 'Value_number : f')]

I want the output to look like this: [('Key_one1_end', 'Value_number : 10'), ('Key_two2_end', 'Value_number : f')]

Any suggestions?

CodePudding user response:

pattern1|pattern2 matches either of the patterns, so each match fills only one of the two capture groups and leaves the other as an empty string.
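To see that behaviour in isolation, here is a minimal sketch on a made-up one-line string (the `sample` text is hypothetical, not the asker's file):

```python
import re

# Hypothetical sample text, used only to illustrate the point.
sample = "Key_a_end junk Value_number : 1"

# Two capture groups joined by "|": each match fills exactly one group,
# and findall reports the group that did not take part as ''.
either = re.findall(r'(Key_\w*_end)|(Value_number...\w*)', sample)
print(either)  # [('Key_a_end', ''), ('', 'Value_number : 1')]
```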

If you want to combine them in a single match, don't use alternation. Put a non-greedy wildcard (.*?) between the two groups to match the text that separates them; with re.DOTALL it can also cross line breaks.

matches = re.findall(r'(Key_\w*_end).*?(Value_number...\w*)', f.read(), re.DOTALL)
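As a self-contained check, the sketch below runs that pattern against the contents of test.txt from the question, inlined as a string so no file is needed (the variable names `text` and `pairs` are just for illustration):

```python
import re

# Contents of test.txt from the question, inlined for a runnable example.
text = (
    "      blah blah blah||| blah blah|| \n"
    "                Key_one1_end  ||   blah blah\n"
    "                blah blah || blah\n"
    "blah blah |||||| blah blah Value_number : 10\n"
    "      blah blah blah||| blah blah|| \n"
    "                Key_two2_end  ||   blah blah\n"
    "                blah blah || blah\n"
    "                      Value_number : f\n"
)

# .*? lazily consumes everything (including newlines, thanks to DOTALL)
# between a key and the next value, so both groups land in one match.
pairs = re.findall(r'(Key_\w*_end).*?(Value_number...\w*)', text, re.DOTALL)
print(pairs)
# [('Key_one1_end', 'Value_number : 10'), ('Key_two2_end', 'Value_number : f')]
```

One caveat: this assumes every key is followed by its own value. If a key had no value before the next key appeared, the lazy wildcard would run on and pair it with the later value.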