How to implement a regex that detects JavaScript comments in PHP

Time:01-13

I have a regex that detects JavaScript comments, and it works as expected on regex101.

But I'm not sure how to use it in PHP to remove the detected comments. I've tried preg_replace_callback (https://onlinephp.io/c/2d3e9), but I don't get the result I want:

$js=<<<'PATTERN'
doSomething('aaaaa//cccccccc'); // c1ccccccc
/* c2cc' cc'ccc */
doSomething2(111, 222, 333); // c3ccccccc
abc.replace(/'/g, 'aaaaaa//aaaaa'); /* c4ccccccc */
abc.replace(/"/g, 'aaaaaaa'); /* c5ccccccc */
doSomething("<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>",1234);// c6ccccccc
doSomething('<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>',1234);// c7ccccccc
PATTERN;

$regex=<<<'PATTERN2'
~((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\/~
PATTERN2;

$newJS = preg_replace_callback(
    $regex,
    function ($m) {
        // Replace block and line comments with a marker; return strings and regex literals unchanged
        if (strncmp($m[0], "/*", 2) === 0) return "xx";
        if (strncmp($m[0], "//", 2) === 0) return "xx";
        return $m[0];
    },
    $js
);

resulting in

doSomething('aaaaa//cccccccc'); // c1ccccccc
xx
doSomething2(111, 222, 333); // c3ccccccc
abc.replace(/'/g, 'aaaaaa//aaaaa'); xx
abc.replace(/"/g, 'aaaaaaa'); xx
doSomething("<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>",1234);// c6ccccccc
doSomething('<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>',1234);xx

So how do I do this?

CodePudding user response:

First of all: regex is not the right tool for this. For instance, it does not recognise JavaScript template literals, which have their own particularities (e.g. multiline, used with String.raw, ...).
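
As an illustration of that limitation, here is a minimal sketch that runs the question's pattern (with the m modifier already appended) over a template literal containing a URL. The sample input and the variable names are made up for this example. Because backticks are not handled as string delimiters, the // inside the literal is taken for a line comment:

```php
<?php
// The pattern from the question, with the m modifier appended.
$regex = <<<'PATTERN2'
~((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\/~m
PATTERN2;

// Hypothetical input: a JavaScript template literal that contains "//".
$js = 'const url = `http://example.com`;';

// Drop anything that looks like a comment; keep every other match as-is.
$stripped = preg_replace_callback($regex, function ($m) {
    if (strncmp($m[0], '/*', 2) === 0 || strncmp($m[0], '//', 2) === 0) {
        return '';
    }
    return $m[0];
}, $js);

echo $stripped, PHP_EOL; // prints: const url = `http:
```

The rest of the literal is lost, which is why a real JavaScript tokenizer or minifier is the safer choice for production code.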

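But to your immediate issue: the difference between your regex101 and PHP attempts is that the PHP pattern lacks the multiline modifier (m). Without it, ^ and $ match only at the start and end of the whole string instead of at every line, so \/\/.*?$ can only match a line comment on the last line. That is why c1, c3 and c6 survive in your output while c7 is replaced.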
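
To see the anchor behaviour in isolation, here is a small sketch with a simplified line-comment pattern. The two-line input and the variable names are invented for the demonstration; only preg_match_all and the m modifier are standard PHP/PCRE:

```php
<?php
// Two lines of JavaScript, each ending in a line comment.
$text = "a(); // one\nb(); // two";

// Without m: $ only matches at the end of the whole subject,
// and . does not match a newline, so only the last comment is found.
preg_match_all('~//.*?$~', $text, $without);

// With m: $ also matches before every newline, so both comments are found.
preg_match_all('~//.*?$~m', $text, $with);

print_r($without[0]); // contains only "// two"
print_r($with[0]);    // contains "// one" and "// two"
```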

Fix it by appending m at the end of the regex:

$regex=<<<'PATTERN2'
~((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\/~m
PATTERN2;
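
Put together with the sample from the question, the corrected pattern can be used as in the sketch below. It keeps the question's "xx" marker so the replacements are easy to spot; the heredoc label JS and the combined if condition are choices made for this example, not part of the original code:

```php
<?php
// Sample JavaScript from the question.
$js = <<<'JS'
doSomething('aaaaa//cccccccc'); // c1ccccccc
/* c2cc' cc'ccc */
doSomething2(111, 222, 333); // c3ccccccc
abc.replace(/'/g, 'aaaaaa//aaaaa'); /* c4ccccccc */
abc.replace(/"/g, 'aaaaaaa'); /* c5ccccccc */
doSomething("<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>",1234);// c6ccccccc
doSomething('<div>aaaaaaaa//aaaaaaaaaaaaa aaaaaaa aaaaaaa</div>',1234);// c7ccccccc
JS;

// Same pattern as above, with the m modifier after the closing delimiter.
$regex = <<<'PATTERN2'
~((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\/~m
PATTERN2;

$newJS = preg_replace_callback($regex, function ($m) {
    // $m[0] is either a string, a regex literal (including the character
    // before it), or a comment. Only comments start with /* or //.
    if (strncmp($m[0], '/*', 2) === 0 || strncmp($m[0], '//', 2) === 0) {
        return 'xx';
    }
    return $m[0];
}, $js);

echo $newJS, PHP_EOL;
```

With this input, each of the seven comments (c1 to c7) is replaced by xx, while the // sequences inside the quoted strings are left alone. Returning an empty string instead of 'xx' removes the comments outright.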