Regex match up to the LAST occurrence of a pattern (e.g. </div>) BEFORE another matching patte

Time:12-17

In other words, there can be no other occurrence of the pattern between the end of the match and the second pattern. This needs to be implemented in a single regular expression.

In my specific case I have a page of HTML and need to extract all the content between

<w-block-content><span><div>

and

</div></span></w-block-content>

where

  • the elements might have attributes
  • the HTML might be formatted or not - there might be extra white space and newlines
  • there may be other content between any of the above tags, including inner div elements nested within the outer div. However, you can assume that
    • each <w-block-content> element contains ONLY ONE direct child
      • <span> element, which contains ONLY ONE direct child
        • <div> element, which wraps
          • the content that must be extracted
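One possible single-expression approach, sketched here in Python under a few assumptions that are not stated in the question: attribute values never contain a literal `>`, `<w-block-content>` elements are not nested inside each other, and whatever sits between the three closing tags is plain text or whitespace (no tags or comments). Under those assumptions, a lazy `.*?` followed by the full closing sequence stops at the first `</div>` that is directly followed by `</span>` and `</w-block-content>`, which is necessarily the last `</div>` before the closing `</w-block-content>`. The names `BLOCK_RE`, `extract_blocks` and `SAMPLE` are made up for illustration.

```python
import re

# Hypothetical pattern name; a sketch, not a general HTML parser.
BLOCK_RE = re.compile(
    r"<w-block-content(?:\s[^>]*)?>"   # outer opening tag, optional attributes
    r"[^<]*"                           # whitespace/text, but no other tag
    r"<span(?:\s[^>]*)?>"              # the single <span> child
    r"[^<]*"
    r"<div(?:\s[^>]*)?>"               # the single <div> child
    r"(?P<content>.*?)"                # lazy: grow only as far as needed
    r"</div\s*>"                       # a closing div ...
    r"[^<]*</span\s*>"                 # ... followed only by text, then </span>
    r"[^<]*</w-block-content\s*>",     # ... then text, then the outer close
    re.DOTALL | re.IGNORECASE,         # '.' must match newlines too
)


def extract_blocks(html: str) -> list[str]:
    """Return the inner content of every matching block, whitespace-trimmed."""
    return [m.group("content").strip() for m in BLOCK_RE.finditer(html)]


# Small made-up sample with attributes, newlines, and a nested inner div.
SAMPLE = """
<w-block-content class="a">
  <span id="s">
    <div class="outer">
      <p>Hello</p>
      <div class="inner">nested</div>
    </div>
  </span>
</w-block-content>
<w-block-content><span><div>second</div></span></w-block-content>
"""

if __name__ == "__main__":
    for block in extract_blocks(SAMPLE):
        print(repr(block))
```

The inner `</div>` is skipped because it is followed by another `</div>` rather than `</span>`, so the closing sequence fails there and the lazy group keeps growing. If the markup can break any of the assumptions above, a real HTML parser is likely the safer tool.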