Regex replace word in string that are not in quotes

Time:02-07

I need to do replacements in strings that come from unknown third-party inputs and sometimes contain quotes.

I want to replace a whole word wherever it occurs, unless it is inside a double- or single-quoted string. Escaped quotes do not open or close a string.

Example: Replacing FOO by BAR

Input:

FOO "FOO" 'FOO' "    1   FOO   2 " ABCFOOXYZ "  str1\"FOO\"str3'FOO'\'\'" '  str1\'FOOstr3"FOO"\"\"' \"FOO\"

Expected output:

BAR "FOO" 'FOO' "    1   FOO   2 " ABCFOOXYZ "  str1\"FOO\"str3'FOO'\'\'" '  str1\'FOOstr3"FOO"\"\"' \"BAR\"

More tests:

name: FOO
favoriteQuote: "I am my own FOO."
children: 'FOO\'s children'
cars: ownersList[FOO]
statement = FOO   " is the owner of the house."

Expected output:

name: BAR
favoriteQuote: "I am my own FOO."
children: 'FOO\'s children'
cars: ownersList[BAR]
statement = BAR   " is the owner of the house."

I saw this question: Match and replace a word not in quotes (string contains escaped quotes), which I thought was similar and could be a good starting point, but the accepted answer does not work at all:

https://regex101.com/r/Lfan64/5

If anyone could help me get the expected result from my regex that would be great, thanks.

CodePudding user response:

You can use

const text = String.raw`name: FOO
favoriteQuote: "I am my own FOO."
children: 'FOO\'s children'
cars: ownersList[FOO]
statement = FOO   " is the owner of the house."
FOO "FOO" 'FOO' "    1   FOO   2 " ABCFOOXYZ "  str1\"FOO\"str3'FOO'\'\'" '  str1\'FOOstr3"FOO"\"\"' \"FOO\"`
console.log( text.replace(
  /((?:[^\\]|^)(?:\\{2})*(?:"[^"\\]*(?:\\[^][^"\\]*)*"|'[^'\\]*(?:\\[^][^'\\]*)*'))|\bFOO\b/g,
  // Keep quoted strings (Group 1) unchanged; replace any other whole-word FOO
  (match, group) => group || "BAR"
))

Details:

  • ((?:[^\\]|^)(?:\\{2})*(?:"[^"\\]*(?:\\[^][^"\\]*)*"|'[^'\\]*(?:\\[^][^'\\]*)*')) - Group 1:
    • (?:[^\\]|^) - a char other than \ or start of string
    • (?:\\{2})* - zero or more pairs of backslashes
    • (?:"[^"\\]*(?:\\[^][^"\\]*)*"|'[^'\\]*(?:\\[^][^'\\]*)*') - either a double- or a single-quoted string literal pattern with escape sequence support
  • | - or
  • \bFOO\b - FOO as a whole word in any other context (the word boundaries keep ABCFOOXYZ intact).

The (match, group) => group || "BAR" replacement means that if Group 1 matches, the replacement is Group 1 value, else, the replacement is BAR.
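
If the word and its replacement are not known in advance, the same idea can be wrapped in a function. The following is only a sketch: replaceOutsideQuotes is a hypothetical helper name, not part of the answer, and it assumes the word starts and ends with a word character so that \b behaves as intended.

```javascript
// Hypothetical helper (not from the answer): same pattern as above, with the
// search word and the replacement passed in as parameters.
function replaceOutsideQuotes(text, word, replacement) {
  // Escape regex metacharacters so the word is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Group 1: a single- or double-quoted string whose opening quote is not escaped.
  const quoted =
    String.raw`((?:[^\\]|^)(?:\\{2})*(?:"[^"\\]*(?:\\[^][^"\\]*)*"|'[^'\\]*(?:\\[^][^'\\]*)*'))`;
  const re = new RegExp(quoted + "|\\b" + escaped + "\\b", "g");
  // A callback is used so that "$" in the replacement is taken literally.
  return text.replace(re, (match, group) => group || replacement);
}

console.log(replaceOutsideQuotes('FOO "FOO" ABCFOOXYZ', "FOO", "BAR"));
// BAR "FOO" ABCFOOXYZ
```

One caveat worth checking against your own inputs: the quoted-string branch consumes one character before the opening quote, so a string that begins immediately after a previous match (for example FOO"FOO") may not be recognized as quoted.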

CodePudding user response:

If I am understanding your requirements correctly, you may try this regex for your cases:

/((['"])(?:\\.|(?!\2).)*(?<!\\)\2)|\bFOO\b/g

This regex uses alternation: the left-hand side of | matches the quoted strings that must be kept unchanged (captured in group #1), whereas the right-hand side matches the word we want to replace.

Code:

const str = String.raw`name: FOO
favoriteQuote: "I am my own FOO."
children: 'FOO\'s children'
cars: ownersList[FOO]
statement = FOO   " is the owner of the house."
FOO "FOO" 'FOO' "    1   FOO   2 " ABCFOOXYZ "  str1\"FOO\"str3'FOO'\'\'" '  str1\'FOOstr3"FOO"\"\"' \"FOO\"`;

// If a quoted string matched (group #1), put it back as is; otherwise replace FOO
const repl = str.replace(/((['"])(?:\\.|(?!\2).)*(?<!\\)\2)|\bFOO\b/g,
  (_, g) => g || "BAR");

console.log(repl);

RegEx Details:

  • (: Start capture group #1
    • (['"]): Match ' or " in capture group #2
    • (?:\\.|(?!\2).)*: Match an escaped character or any character except the quote we matched in capture group #2
    • (?<!\\)\2: Match whatever quote we matched in capture group #2 as long as it is not preceded by a \
  • ): End capture group #1
  • |: OR
  • \bFOO\b: Match complete word FOO
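
The (?<!\\) lookbehind requires an engine with lookbehind support (added to JavaScript in ES2018). If that is a concern, one possible adaptation, which is not part of the answer above, is to never let a lone backslash be consumed as an ordinary character, so that an escaped quote cannot close the string. The sketch below checks this variant against the first test case from the question; the names noLookbehind, input, expected and output are made up for illustration.

```javascript
// Adapted variant without lookbehind (an assumption, not the answer's pattern):
// inside a string, a backslash is only consumed together with the character
// that follows it, so \" or \' can never act as the closing quote.
const noLookbehind = /((['"])(?:\\.|(?!\2|\\).)*\2)|\bFOO\b/g;

// First test case from the question, kept verbatim via String.raw.
const input = String.raw`FOO "FOO" 'FOO' "    1   FOO   2 " ABCFOOXYZ "  str1\"FOO\"str3'FOO'\'\'" '  str1\'FOOstr3"FOO"\"\"' \"FOO\"`;
const expected = String.raw`BAR "FOO" 'FOO' "    1   FOO   2 " ABCFOOXYZ "  str1\"FOO\"str3'FOO'\'\'" '  str1\'FOOstr3"FOO"\"\"' \"BAR\"`;

// Quoted strings (group #1) are returned unchanged; any other FOO becomes BAR.
const output = input.replace(noLookbehind, (_, quoted) => quoted || "BAR");

console.log(output === expected); // true
```

This has only been checked against the inputs shown in the question; other inputs, such as unbalanced quotes, may behave differently from the lookbehind version.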