Capture text around string between 2 characters

Time:02-28

Let's say my full string is this:

blah {something i dont want {sth i want MY STRING blahblah i want } blah

I would like to capture only this part:

{sth i want MY STRING blahblah i want }

Basically, I want everything around and including "MY STRING", up to the nearest { on the left and the nearest } on the right.

I figured out the part after MY STRING with the regex MY STRING.*?(?=})} but I don't know how to capture the first part, i.e. {sth i want . Everything I tried matches all the way back to the first { instead of the nearest one.

CodePudding user response:

That should do the trick:

import re

text = 'blah {something i dont want {sth i want MY STRING blahblah i want } blah'

output = re.findall(r'{[^{]*?MY STRING.*?}', text)[0]
print(output)  # {sth i want MY STRING blahblah i want }


Explanation:

  • {: Matches a literal opening curly brace.
  • [^{]: Matches a single character that isn't {.
  • *?: Repeats the preceding character class zero or more times, as few times as possible (lazy).
  • MY STRING: Matches the literal text MY STRING.
  • .*?: Matches any character (except a newline) zero or more times, as few times as possible (lazy).
  • }: Matches a literal closing curly brace.
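
A slightly stricter variant of the same idea, as a sketch: excluding both braces on both sides of MY STRING keeps the match inside a single pair of braces, and re.search returns None instead of raising when nothing matches. The names pattern, match and result are only illustrative.

```python
import re

text = 'blah {something i dont want {sth i want MY STRING blahblah i want } blah'

# [^{}]* cannot cross a brace on either side, so the match stays inside
# the innermost pair of braces that contains MY STRING.
pattern = re.compile(r'\{[^{}]*MY STRING[^{}]*\}')

match = pattern.search(text)
result = match.group(0) if match else None  # None when there is no match
print(result)  # {sth i want MY STRING blahblah i want }
```

Unlike findall(...)[0], this does not raise an IndexError when the text contains no matching span.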

CodePudding user response:

If you know you'll have balanced braces, you can skip the regex and use .split() instead:

>>> value = "blah {something i dont want {sth i want MY STRING blahblah i want } blah"
>>> # take the last piece after a '{', then the first piece before a '}'
>>> value.rsplit('{')[-1].split("}")[0]
'sth i want MY STRING blahblah i want '

Once you have the inner string, you can put the braces back on.
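
For instance, a minimal sketch of restoring the braces (inner and restored are just illustrative names):

```python
value = "blah {something i dont want {sth i want MY STRING blahblah i want } blah"

# Text after the last '{', cut at the first '}' that follows it
inner = value.rsplit('{')[-1].split('}')[0]

# Re-attach the braces that the split removed
restored = '{' + inner + '}'
print(restored)  # {sth i want MY STRING blahblah i want }
```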

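Note that you may want extra checks depending on your purpose: whether any braces exist at all, whether their counts match, or, for nested input, taking only the outer ends recursively. Without braces, the expression simply returns the whole string: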

>>> value = "something simple"
>>> value.rsplit('{')[-1].split("}")[0]
'something simple'
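
One way to add that kind of checking, as a rough sketch rather than a definitive implementation: the helper below (extract_braced is a hypothetical name, not part of any library) looks for the nearest braces around the first occurrence of the target text and returns None when they are missing or when another brace sits in between.

```python
def extract_braced(value, needle):
    """Hypothetical helper: innermost {...} span around needle, or None."""
    pos = value.find(needle)  # only the first occurrence is considered
    if pos == -1:
        return None
    start = value.rfind('{', 0, pos)          # nearest '{' left of needle
    end = value.find('}', pos + len(needle))  # nearest '}' right of needle
    if start == -1 or end == -1:
        return None
    inner = value[start + 1:end]
    # A stray brace inside means needle is not enclosed by one clean pair
    if '{' in inner or '}' in inner:
        return None
    return value[start:end + 1]


value = "blah {something i dont want {sth i want MY STRING blahblah i want } blah"
print(extract_braced(value, "MY STRING"))
print(extract_braced("something simple", "MY STRING"))
```

The first call prints the braced span and the second prints None, so a caller can tell "no match" apart from a real result.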