RegEx find string between lookbehind and lookahead

Time:09-14

I have this example string taken from an HTML email:

Abholstellenname (Firmenname, Details): Musterfirma GmbH<br>

I'm using the following expression to find the company name, in this case Musterfirma GmbH:

(?<=Abholstellenname \(Firmenname, Details\): ).*

But I need to exclude the <br> tag following the company name. How can I achieve this?

I wouldn't be asking here if I hadn't already read through the tutorials and still not understood it.

CodePudding user response:

You can use

(?<=Abholstellenname \(Firmenname, Details\): ).*?(?=<br>|$)

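The main idea is to replace the .* part with .*?(?=<br>|$): it matches zero or more characters other than line break characters, as few as possible, up to a position that is followed by either <br> or the end of the string. Because (?=...) is a lookahead, the <br> itself is not part of the match.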
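The question doesn't say which regex engine is used, so as a minimal sketch, assuming Python's re module, the pattern can be tried like this. The names html_line, pattern and extract_company are made up for illustration:

```python
import re

# Sample line from the question.
html_line = "Abholstellenname (Firmenname, Details): Musterfirma GmbH<br>"

# Lookbehind anchors the match right after the label; the lazy .*? stops
# at the first position followed by <br> or the end of the string.
pattern = re.compile(
    r"(?<=Abholstellenname \(Firmenname, Details\): ).*?(?=<br>|$)"
)

def extract_company(text):
    """Return the company name, or None if the label is not found."""
    match = pattern.search(text)
    return match.group(0) if match else None

company = extract_company(html_line)
# Same line without the trailing tag: the $ alternative takes over.
no_tag = extract_company("Abholstellenname (Firmenname, Details): Musterfirma GmbH")
print(company)
print(no_tag)
```

Both calls should give the bare company name, since neither the label nor the <br> is consumed by the lookarounds.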


If the spaces can be any whitespace chars, replace the literal spaces in the pattern with \s.
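A short sketch of the \s variant, again assuming Python's re module (the tab-separated sample is invented for illustration). One thing to keep in mind there: Python only accepts fixed-width lookbehinds, so a single \s per space works, but \s+ inside the lookbehind is rejected at compile time:

```python
import re

# Each literal space in the lookbehind is replaced by exactly one \s.
ws_pattern = re.compile(
    r"(?<=Abholstellenname\s\(Firmenname,\sDetails\):\s).*?(?=<br>|$)"
)

# Hypothetical input where the separators are tabs instead of spaces.
tabbed = "Abholstellenname\t(Firmenname,\tDetails):\tMusterfirma GmbH<br>"
ws_match = ws_pattern.search(tabbed)
ws_company = ws_match.group(0) if ws_match else None
print(ws_company)

# A variable-width lookbehind (\s+) does not compile in Python's re.
try:
    re.compile(r"(?<=Details\):\s+).*")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```

Other engines differ on this point, so check your flavor's lookbehind rules if you need \s+ there.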

CodePudding user response:

You can match the spaces with \s (which matches any whitespace character), and you need to escape the literal parentheses as \( and \).

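[^<br>] is a character class: it matches any single char other than <, >, b and r, not the sequence <br>. This works for your example, but a name ending in b or r will lose its last letter, and if you have anything after the <br>, it will be captured as well.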

(?<=Abholstellenname\s\(Firmenname,\sDetails\):\s).*[^<br>]
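To see where the character class falls short, here is a small comparison against the lookahead version, sketched in Python's re as an assumption about the engine. The sample names "Weber" and "Tel. 123" are invented test data:

```python
import re

LOOKBEHIND = r"(?<=Abholstellenname\s\(Firmenname,\sDetails\):\s)"

# Greedy .* followed by a negated character class.
class_pattern = re.compile(LOOKBEHIND + r".*[^<br>]")
# Lazy .*? followed by a lookahead for the literal tag or end of string.
lookahead_pattern = re.compile(LOOKBEHIND + r".*?(?=<br>|$)")

prefix = "Abholstellenname (Firmenname, Details): "
samples = [
    prefix + "Musterfirma GmbH<br>",          # the original example
    prefix + "Weber<br>",                     # name ends in "r"
    prefix + "Musterfirma GmbH<br>Tel. 123",  # text after the tag
]

results = [
    (class_pattern.search(s).group(0), lookahead_pattern.search(s).group(0))
    for s in samples
]
for class_result, lookahead_result in results:
    print(repr(class_result), "vs", repr(lookahead_result))
```

The character class only agrees with the lookahead on the first sample. It cuts "Weber" down to "Webe" and keeps the <br> plus the trailing text in the third one.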