I have the text block below and I am trying to split it into 3 blocks with a regex. Each name field should start a new block. How can I return all 3 blocks?
name: marvin
attribute: one
day: monday
dayalt: test << this is a field that can sometimes show up
name: judy
attribute: two
day: tuesday
name: dot
attribute: three
day: wednesday
import re
lines = """name: marvin
attribute: one
day: monday
dayalt: test << this is a field that can sometimes show up
name: judy
attribute: two
day: tuesday
name: dot
attribute: three
day: wednesday
"""
a=re.findall("(name.*)[\n\S\s]", lines, re.MULTILINE)
Block 1 should be returned as "name: marvin\nattribute: one\nday: monday\ndayalt: test".
Thanks!
CodePudding user response:
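How about the following, which uses a positive lookahead? With re.DOTALL, the lazy .*? matches across newlines and stops just before the next "name: " or at the end of the string: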
import re
lines = """name: marvin
attribute: one
day: monday
dayalt: test
name: judy
attribute: two
day: tuesday
name: dot
attribute: three
day: wednesday"""
blocks = re.findall(r"name: .*?(?=name: |$)", lines, re.DOTALL)
print(blocks)
# ['name: marvin\nattribute: one\nday: monday\ndayalt: test\n',
# 'name: judy\nattribute: two\nday: tuesday\n',
# 'name: dot\nattribute: three\nday: wednesday']
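Note that the blocks above keep their trailing newline, and "name: " would also match in the middle of a line (for example inside a "nickname: " field). If that matters, a variant along these lines may help. It is only a sketch: split_blocks is a made-up helper name, and it assumes every block starts with "name:" at the beginning of a line.

```python
import re

text = """name: marvin
attribute: one
day: monday
dayalt: test
name: judy
attribute: two
day: tuesday
name: dot
attribute: three
day: wednesday
"""


def split_blocks(text):
    """Return one string per block, each starting at a line that begins with 'name:'.

    Hypothetical helper for illustration; assumes 'name:' always opens a block.
    """
    # re.MULTILINE lets ^ match at the start of every line, so 'name:' in the
    # middle of a line (e.g. 'nickname:') does not start a new block.
    # re.DOTALL lets the lazy .*? run across newlines until the lookahead
    # sees the next block start or the very end of the string (\Z).
    pattern = r"^name:.*?(?=^name:|\Z)"
    found = re.findall(pattern, text, re.DOTALL | re.MULTILINE)
    # Drop the trailing newline that separates one block from the next.
    return [block.rstrip("\n") for block in found]


blocks = split_blocks(text)
for block in blocks:
    print(repr(block))
```

The lookahead is zero-width, so the next "name:" line is not consumed and becomes the start of the following match.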