Is it possible to extract the text enclosed in a specific pair of delimiters, such as {}, when it occurs multiple times in a string?
string = 'Revenue for the period is {value="32", font="34"} and EBITDA is {value="12", font="34"} for 2022'
The output should contain the instances as elements in a list in string format.
output = ['{value="32", font="34"}', '{value="12", font="34"}']
CodePudding user response:
You can use re to apply a regex pattern that matches text in curly brackets:
import re
# Don't use str (a built-in class) as a variable name
string = 'Revenue for the period is {value="32", font="34"} and EBITDA is {value="12", font="34"} for 2022'
output = re.findall(r'{[^}]+}', string)
>>> output
['{value="32", font="34"}', '{value="12", font="34"}']
Note that output will be an empty list if there are no matches.
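As a minimal sketch of handling the no-match case: the helper name extract_braced below is hypothetical (it is not part of re), and the input string is made up for illustration. Since re.findall() returns an empty list rather than None, a plain truthiness check is enough.

```python
import re

def extract_braced(text):
    # Hypothetical helper: return every {...} group in text, braces included.
    # [^}]+ matches one or more characters that are not a closing brace.
    return re.findall(r'\{[^}]+\}', text)

# No curly brackets in this input, so findall returns an empty list.
matches = extract_braced('Revenue for the period is 32 and EBITDA is 12')
if not matches:
    print('no {...} groups found')
```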
Online demo: https://regex101.com/r/XB7U9g/1
My recommendation for learning regex patterns: https://regexone.com/
CodePudding user response:
You can use the function re.findall():
import re
text = (
'Revenue for the period is {value="32", font="34"} and EBITDA is {value="12", '
'font="34"} for 2022'
)
pattern = re.compile(r'{value="\d+", font="\d+"}')
print(re.findall(pattern, text))
Output:
['{value="32", font="34"}', '{value="12", font="34"}']
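If the numbers inside each group are needed rather than the raw substrings, one possible extension (not part of the original answer) is to add named capture groups and iterate with re.finditer(). This is only a sketch; the group names value and font and the variable records are chosen here for illustration, and it assumes every group has exactly this value/font layout.

```python
import re

text = (
    'Revenue for the period is {value="32", font="34"} and EBITDA is {value="12", '
    'font="34"} for 2022'
)

# Named groups capture the digits; the braces are escaped to match literally.
pattern = re.compile(r'\{value="(?P<value>\d+)", font="(?P<font>\d+)"\}')

# groupdict() maps each group name to the text it captured (still strings).
records = [m.groupdict() for m in pattern.finditer(text)]
print(records)
# [{'value': '32', 'font': '34'}, {'value': '12', 'font': '34'}]
```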