Get text between delimiters, Python and regex

Time:12-30

Please help with this code, I'm stuck.

import re

test = """some variable text
----------------------------------------
this is neded string
and this is also needed string
there may be many of them
but the last one always ends with this equal mark=

another variable block (one or many strings)=
another variable block (one or many strings)=
----------------------------------------"""
p = re.compile("----------------------------------------\n(.*)=\n\n")
result = p.findall(test)

print(result)

I need to get the text block (one or many lines) between the first line of dashes and the first '=' sign, which is always followed by an empty line.

This code returns an empty list, and I don't understand why.

p.search(test)

returns None.

CodePudding user response:

You can try a regex like:

p = re.compile('(?s)----------------------------------------(.*?)=')
# or re.compile('----------------------------------------(.*?)=', re.DOTALL)

Result:

['\nthis is neded string\nand this is also needed string\nthere may be many of them\nbut the last one always ends with this equal mark']
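
The result above keeps the newline that follows the dashes and does not check for the empty line after the '=' sign. If both matter, a minimal sketch could look like the following, assuming the delimiter line is exactly 40 dashes and the wanted block is always followed by a blank line. The names DASHES, lines and block are only for illustration.

```python
import re

DASHES = "-" * 40  # assumption: the delimiter line is exactly 40 dashes

# Same sample text as in the question, rebuilt line by line.
lines = [
    "some variable text",
    DASHES,
    "this is neded string",
    "and this is also needed string",
    "there may be many of them",
    "but the last one always ends with this equal mark=",
    "",
    "another variable block (one or many strings)=",
    "another variable block (one or many strings)=",
    DASHES,
]
test = "\n".join(lines)

# Consume the newline after the dashes, then lazily capture everything
# up to the first "=" that is followed by an empty line.
p = re.compile(r"-{40}\n(.*?)=\n\n", re.DOTALL)

m = p.search(test)
block = m.group(1) if m else None
print(block)
```

Here p.search returns only the first match, which fits the "first line of dashes, first '='" requirement; block is None when nothing matches.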

Explanation:

  • (?s): This is the inline form of the re.DOTALL flag. You can use re.compile(..., re.DOTALL) instead, as @Matiiss mentioned. It makes . match any character, including the newline character (\n). Without this flag, . matches any character except a newline, which is why (.*) in the original pattern could not span several lines and nothing was found.
  • *?: This is a non-greedy (lazy) quantifier that matches the shortest possible text, which prevents the match from running on to the last =.
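
To see the two points above side by side, here is a small sketch on a shortened, made-up sample (three dashes instead of forty; the variable names are only for illustration):

```python
import re

# Shortened stand-in for the question's text.
text = "---\nline one\nline two=\n\nlast=\n---"

# Without DOTALL, "." stops at every newline, so "(.*)" cannot span two lines.
no_dotall = re.findall(r"---\n(.*)=\n\n", text)

# With DOTALL but a greedy "*", the capture runs up to the LAST "=".
greedy = re.findall(r"---\n(.*)=", text, re.DOTALL)

# With DOTALL and the lazy "*?", the capture stops at the FIRST "=".
lazy = re.findall(r"---\n(.*?)=", text, re.DOTALL)

print(no_dotall)  # []
print(greedy)     # ['line one\nline two=\n\nlast']
print(lazy)       # ['line one\nline two']
```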

You can use a counted quantifier instead of repeating the - characters.

  • re.compile('(?s)-{40}(.*?)=')

And if you don't need to specify the exact length, then

  • re.compile('(?s)-+(.*?)=') will do (one or more -s). (thanks to @ViettelSolutions)
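
As a quick check that the three ways of writing the dashes behave the same, here is a sketch on a made-up sample (the text and variable names are assumptions for illustration):

```python
import re

# Made-up sample with two 40-dash delimiter lines.
text = "header\n" + "-" * 40 + "\nwanted block=\n\nrest=\n" + "-" * 40

literal = re.findall("(?s)" + "-" * 40 + "(.*?)=", text)  # dashes spelled out
counted = re.findall(r"(?s)-{40}(.*?)=", text)            # exactly 40 dashes
one_or_more = re.findall(r"(?s)-+(.*?)=", text)           # one or more dashes

print(literal)      # ['\nwanted block']
print(counted)      # ['\nwanted block']
print(one_or_more)  # ['\nwanted block']
```

Note that -+ also matches a single hyphen, so if the text before the delimiter line can contain a - character, the match would start there; -{40} is the safer choice in that case.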