I'm trying to find all paragraphs in a text (in JavaScript/jQuery) that are not yet wrapped in one of a defined set of HTML tags:
p|h1|h2|h3|h4|h5|h6|blockquote|img|table|iframe
My current regex (https://regex101.com/r/O4i2hP/1) already matches paragraphs and excludes the defined tags:
(.+?(?<![</(p|h1|h2|h3|h4|h5|h6|blockquote|img|table|iframe)>]$))(\n|$) /gm
but I can't figure out how to match whole tags only.
The problem is:
Because it sits inside square brackets, (p|h1|h2|h3|h4|h5|h6|blockquote|img|table|iframe)> is treated as a character class and matches a single character in the list (p|h123456blockquteimgafr)> (case sensitive)
Thus, as you can see from the example, text that is wrapped in other tags, such as <strong>TEXT</strong>,
is also excluded.
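To make the effect of the character class concrete, here is a minimal sketch. It assumes the quantifier in the pattern is `.+?`, and the sample lines are made up for illustration. Any line whose last character appears in the class is skipped, so a line ending in `>` or even in a plain `e` is excluded.

```javascript
// Pattern from the question, assuming the quantifier is `.+?`.
// The square brackets turn the tag list into a character class.
const charClassPattern =
  /(.+?(?<![<\/(p|h1|h2|h3|h4|h5|h6|blockquote|img|table|iframe)>]$))(\n|$)/gm;

// Hypothetical sample input, one candidate paragraph per line.
const charClassSample = [
  "Plain paragraph.",
  "<p>Wrapped paragraph</p>",
  "<strong>TEXT</strong>",
  "A line like this one",
].join("\n");

// Group 1 holds the line without its trailing newline.
const charClassMatches = [...charClassSample.matchAll(charClassPattern)].map(
  (m) => m[1]
);

// Only the line ending in "." survives; "e" and ">" are both in the class.
console.log(charClassMatches);
```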
I tried different things, such as word boundaries (\bword\b), but couldn't get it working. I hope you can help. Thanks.
CodePudding user response:
This will do it.
^(?!<(p|h1|h2|h3|h4|h5|h6|blockquote|img|table|iframe) ?>.*</\1>).*$
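A quick way to try this pattern in JavaScript is sketched below. It assumes the wildcards in the pattern are `.*`, that the g and m flags are used, and the sample lines are invented. Note that the backreference `\1` requires a matching closing tag, so a tag without one (such as `<img>`) would not be excluded by this pattern.

```javascript
// Negative lookahead: skip lines that are wrapped in one of the listed tags.
// Assumes `.*` wildcards and the g/m flags.
const lookaheadPattern =
  /^(?!<(p|h1|h2|h3|h4|h5|h6|blockquote|img|table|iframe) ?>.*<\/\1>).*$/gm;

// Hypothetical sample input, one candidate paragraph per line.
const lookaheadSample = [
  "<p>Wrapped paragraph</p>",
  "Plain paragraph",
  "<strong>TEXT</strong>",
  "<h2>Title</h2>",
].join("\n");

// `.*` can also match empty lines, so drop empty results.
const lookaheadMatches = [...lookaheadSample.matchAll(lookaheadPattern)]
  .map((m) => m[0])
  .filter((line) => line !== "");

console.log(lookaheadMatches);
```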
CodePudding user response:
I have now found a working approach. The tags should be wrapped in groups rather than in a character class. The following works for me:
(.+?(?<!(<\/)(p|h1|h2|h3|h4|h5|h6|blockquote|img|table|iframe)(>)$))(\n|$) /gm
see also: https://regex101.com/r/DC5msM/1
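As a rough check of the grouped version, the sketch below runs it against a few invented lines. It assumes the quantifier is `.+?` and relies on lookbehind support in the JavaScript engine (available in current Node.js and modern browsers). Only lines ending in a closing tag from the list are skipped, so `<strong>TEXT</strong>` is now matched.

```javascript
// Grouped version: the lookbehind now requires a whole closing tag
// such as </p> or </h2> right before the end of the line.
const groupedPattern =
  /(.+?(?<!(<\/)(p|h1|h2|h3|h4|h5|h6|blockquote|img|table|iframe)(>)$))(\n|$)/gm;

// Hypothetical sample input, one candidate paragraph per line.
const groupedSample = [
  "<p>Wrapped paragraph</p>",
  "Plain paragraph",
  "<strong>TEXT</strong>",
  "<h2>Title</h2>",
].join("\n");

// Group 1 holds the line without its trailing newline.
const groupedMatches = [...groupedSample.matchAll(groupedPattern)].map(
  (m) => m[1]
);

console.log(groupedMatches);
```

Since the check only looks at the end of the line, a tag with no closing counterpart, such as `<img>`, would still be matched by this pattern.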