But I'm not sure how to put this into PHP to remove the detected items. I've tried using preg_replace_callback (https://onlinephp.io/c/2d3e9), but I don't get the result I want:
$js=<<<'PATTERN'
doSomething('aaaaa//cccccccc'); // c1ccccccc
/* c2cc' cc'ccc */
doSomething2(111, 222, 333); // c3ccccccc
abc.replace(/'/g, 'aaaaaa//aaaaa'); /* c4ccccccc */
abc.replace(/"/g, 'aaaaaaa'); /* c5ccccccc */
doSomething("<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>",1234);// c6ccccccc
doSomething('<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>',1234);// c7ccccccc
PATTERN;
$regex=<<<'PATTERN2'
~((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\/~
PATTERN2;
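$newJS = preg_replace_callback($regex, function ($m) {
    // Replace block comments and line comments with a marker.
    if ( strcmp(substr($m[0], 0, 2), "/*")==0 ) return "xx";
    if ( strcmp(substr($m[0], 0, 2), "//")==0 ) return "xx";
    // Anything else (string or regex literal) is kept as-is.
    return $m[0];
}, $js);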
This results in:
doSomething('aaaaa//cccccccc'); // c1ccccccc
xx
doSomething2(111, 222, 333); // c3ccccccc
abc.replace(/'/g, 'aaaaaa//aaaaa'); xx
abc.replace(/"/g, 'aaaaaaa'); xx
doSomething("<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>",1234);// c6ccccccc
doSomething('<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>',1234);xx
So how do I do this?
CodePudding user response:
First of all: regex is not the right tool for this. For instance, this pattern does not recognise JavaScript template literals, which have their own particularities (e.g. they can span multiple lines and can be used with String.raw).
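To make the template-literal limitation concrete, here is a minimal sketch that runs the pattern from the question on a one-line input containing a template literal. The input string and the variable names are made up for illustration. Because the input is a single line, the pattern can be used exactly as posted.

```php
<?php
// Pattern copied unchanged from the question.
$regex = <<<'PATTERN2'
~((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\/~
PATTERN2;

// Hypothetical input: the "//" sits inside a template literal, so it is NOT a comment.
$js = 'const s = `a // b`;';

$out = preg_replace_callback($regex, function ($m) {
    // Drop anything the pattern classifies as a comment.
    if (strncmp($m[0], '/*', 2) === 0 || strncmp($m[0], '//', 2) === 0) {
        return '';
    }
    return $m[0];
}, $js);

// The pattern only knows ' and " as string delimiters, so the backtick
// literal is not protected and the output differs from the input.
var_dump($out === $js);
echo $out, PHP_EOL;
```

Since the backtick is not treated as a string delimiter, everything from the // to the end of the line is treated as a comment and cut, which leaves broken JavaScript behind.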
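But to your immediate issue: the difference between your regex101 and PHP attempts is that the PHP version lacks the multiline pattern modifier, so the ^ and $ anchors are interpreted differently. Without it, ^ matches only at the very start of the subject and $ only at its very end (or just before a trailing newline), instead of at every line break. Fix it by appending m after the closing delimiter of the regex: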
$regex=<<<'PATTERN2'
~((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\/~m
PATTERN2;
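Put together, a minimal end-to-end sketch could look like the following. The sample input is a shortened, made-up variant of the one in the question, and stripJsComments() is a hypothetical helper name, not a PHP built-in. It returns an empty string for comments instead of "xx", on the assumption that the goal is to remove them.

```php
<?php
// Shortened, hypothetical JavaScript sample in the spirit of the question's input.
$js = <<<'JS'
doSomething('a//b'); // c1
/* c2 */
abc.replace(/'/g, 'x//y'); /* c4 */
JS;

// Pattern from the question, with the m modifier appended after the closing ~.
$regex = <<<'PATTERN2'
~((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\/~m
PATTERN2;

// Hypothetical helper: removes // and /* */ comments matched by $regex.
function stripJsComments(string $js, string $regex): string
{
    $result = preg_replace_callback($regex, function (array $m): string {
        // Matches starting with "//" or "/*" are comments: drop them.
        if (strncmp($m[0], '//', 2) === 0 || strncmp($m[0], '/*', 2) === 0) {
            return '';
        }
        // Everything else is a string or regex literal: keep it untouched.
        return $m[0];
    }, $js);

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $js;
}

$newJS = stripJsComments($js, $regex);
echo $newJS, PHP_EOL;
```

Whitespace that preceded a removed comment is left in place in this sketch; trim it per line afterwards if that matters. The template-literal caveat above still applies.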