Can python extract text within repeated specific characters?

Time:06-14

Is it possible to extract the text enclosed in a specific pair of characters, such as {}, when it occurs multiple times in a string?

string = 'Revenue for the period is {value="32", font="34"} and EBITDA is {value="12", font="34"} for 2022'

The output should contain the instances as elements in a list in string format.

output = ['{value="32", font="34"}', '{value="12", font="34"}']

CodePudding user response:

You can use re to apply a regex pattern to match text in curly brackets:

import re

# Don't use str (a built-in class) as a variable name
string = 'Revenue for the period is {value="32", font="34"} and EBITDA is {value="12", font="34"} for 2022'

output = re.findall(r'{[^}]+}', string)
>>> output
['{value="32", font="34"}', '{value="12", font="34"}']

Note that output will be an empty list if there are no matches.
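
As a rough illustration of why the negated character class matters, the sketch below compares it with a greedy dot pattern and shows the empty-list case. The shortened sample text and the variable names per_brace, greedy and empty are made up for this example and are not part of the original answer.

```python
import re

# Shortened sample text (an assumption for this sketch, not the asker's exact string)
text = 'Revenue is {value="32", font="34"} and EBITDA is {value="12", font="34"}'

# Negated character class: each match stops at the first closing brace.
per_brace = re.findall(r'{[^}]+}', text)
print(per_brace)

# Greedy dot: a single match running from the first '{' to the last '}'.
greedy = re.findall(r'{.+}', text)
print(greedy)

# Input without braces: findall returns an empty list rather than None.
empty = re.findall(r'{[^}]+}', 'no braces here')
print(empty)
```

Because the no-match result is a list, it can be tested with a plain `if output:` check.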

Online demo: https://regex101.com/r/XB7U9g/1

My recommendation for learning regex patterns: https://regexone.com/

CodePudding user response:

You can use the function re.findall():

import re

text = (
    'Revenue for the period is {value="32", font="34"} and EBITDA is {value="12", '
    'font="34"} for 2022'
)
pattern = re.compile(r'{value="\d+", font="\d+"}')
print(re.findall(pattern, text))

Output:

['{value="32", font="34"}', '{value="12", font="34"}']
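
If the numbers themselves are needed rather than the whole braced text, one possible variation (not part of the answer above) is to add capture groups to the same kind of pattern. This is only a sketch: the shortened text and the name pairs are assumptions made for illustration.

```python
import re

# Shortened sample text (an assumption for this sketch)
text = 'Revenue is {value="32", font="34"} and EBITDA is {value="12", font="34"}'

# With capture groups, findall returns one tuple of group contents per match
# instead of the full matched string.
pattern = re.compile(r'{value="(\d+)", font="(\d+)"}')
pairs = pattern.findall(text)
print(pairs)  # each item is a (value, font) tuple of strings
```

Note that this stricter pattern only matches braces whose contents have exactly this layout, whereas a generic pattern such as `{[^}]+}` matches any braced text.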