Given this body of text:
First Citizen:
Before we proceed any further, hear me speak.
What authority surfeits on would relieve us:
Speak, speak.
ALL:
You are all resolved rather to die than to famish?
I would like to match:
['First Citizen', 'ALL']
I originally tried something like this r'([\w -:]*:)'
but want to limit it to lines with only 2 words.
Specifications:
- Line ends with :
- Line only has two words or less
- match those one or two words
CodePudding user response:
Any word: \w+
Any two words (with whitespace): \w+\s\w+
Any two words or less (assuming one, not zero): \w+(?:\s\w+)?
Any two words or less on their own line: ^\w+(?:\s\w+)?$
Any two words or less on their own line ending in ":": ^\w+(?:\s\w+)?:$
The result in (Python) code:
import re
text = """
First Citizen:
Before we proceed any further, hear me speak.
What authority surfeits on would relieve us:
Speak, speak.
ALL:
You are all resolved rather to die than to famish?
"""
for match in re.findall(r"^\w+(?:\s\w+)?:$", text, re.MULTILINE):
    print(match)
The output:
First Citizen:
ALL:
You didn't specify a language, so yours may need a flag or two (such as a multiline mode, like re.MULTILINE in Python) so that ^ and $ match at the start and end of each line rather than of the whole text.
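If you want the names without the trailing colon, as in the list from the question, one option is to wrap the words in a capture group and leave the ":" outside it. The sketch below also swaps \s for [ \t]+, which is my own adjustment rather than part of the pattern above: \s can match a newline, so a one-word line followed by a line like "ALL:" could otherwise be matched as a single two-word hit. The names `pattern` and `speakers` are just illustrative.

```python
import re

text = """First Citizen:
Before we proceed any further, hear me speak.
What authority surfeits on would relieve us:
Speak, speak.
ALL:
You are all resolved rather to die than to famish?
"""

# Capture the one or two words; the trailing ":" stays outside the group.
# [ \t]+ (instead of \s) keeps the gap between the words on a single line.
pattern = re.compile(r"^(\w+(?:[ \t]+\w+)?):$", re.MULTILINE)

# With exactly one capture group, findall returns the captured text only.
speakers = pattern.findall(text)
print(speakers)  # ['First Citizen', 'ALL']
```

Note that \w does not match hyphens or apostrophes, so a hyphenated or apostrophized name would need a wider character class.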