Regex match (replace) all occurrences of double quotes in words between span tags

Time:11-03

I'm trying to replace all occurrences of " between two span tags.

I use:

(?<=<span>[a-zA-Z0-9_æøåÆØÅ_,.;:!#€%&\/()$§'])*(\")(?=[a-zA-Z0-9_æøåÆØÅ_,.;:!#€%&\/()$§']*<\/span>)

Lookbehind for letters and special characters

find "

Lookahead for letters and special characters

But with the html string

<span>d"s"s"</span>

It only matches the last occurrence of ".

How can I match (eventually replace) all occurrences of double quotes within the tag?

Thanks in advance.

CodePudding user response:

Don't bother with the lookbehind. Instead, match " where </span> follows without a <span> appearing before that </span>, i.e. the " is inside a span open/close pair:

"(?=((?!<span>).)*<\/span>)

See live demo.

Breaking down the regex:

  • " a literal quote
  • (?!<span>). any character except the < of <span>
  • ((?!<span>).)* any characters up to, but not including, the < of <span>
  • (?=((?!<span>).)*<\/span>) followed by input that encounters </span> before <span>
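
As a rough sketch of how this pattern could be used for the replacement, here it is in JavaScript. The function name replaceQuotesInSpans, the sample string and the &quot; replacement are only assumptions for illustration. Because . does not match newlines by default, this assumes each span sits on a single line.

```javascript
// Match a quote only if </span> comes later with no <span> in between.
const quoteBeforeClosingSpan = /"(?=((?!<span>).)*<\/span>)/g;

// Hypothetical helper: replace every quote that sits inside a span pair.
function replaceQuotesInSpans(html, replacement) {
  return html.replace(quoteBeforeClosingSpan, replacement);
}

const sample = '<span>d"s"s"</span> and "outside"';
console.log(replaceQuotesInSpans(sample, "&quot;"));
// prints: <span>d&quot;s&quot;s&quot;</span> and "outside"
```

The quotes after the closing tag are left alone because no </span> follows them.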

CodePudding user response:

Use

/(?<=<span>[^<>]*)"(?=[^<>]*<\/span>)/g

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  (?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
    <span>                   '<span>'
--------------------------------------------------------------------------------
    [^<>]*                   any character except: '<', '>' (0 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    [^<>]*                   any character except: '<', '>' (0 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    <                        '<'
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    span>                    'span>'
--------------------------------------------------------------------------------
  )                        end of look-ahead
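
For illustration, a minimal sketch of this regex in JavaScript follows. It assumes an engine with variable-length lookbehind support (for example a recent Node.js or Chromium-based browser). The function name replaceSpanQuotes and the sample input are made up for the example.

```javascript
// Quote preceded by <span> plus tag-free text, and followed by tag-free text plus </span>.
const quoteInSpanText = /(?<=<span>[^<>]*)"(?=[^<>]*<\/span>)/g;

// Hypothetical helper: replace quotes in the text directly inside a span.
function replaceSpanQuotes(html, replacement) {
  return html.replace(quoteInSpanText, replacement);
}

const input = '<span>d"s"s"</span> <b>"bold"</b>';
console.log(replaceSpanQuotes(input, "'"));
// prints: <span>d's's'</span> <b>"bold"</b>
```

Since [^<>]* cannot cross a tag, quotes separated from <span> or </span> by another tag are not matched.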