Perl script split

Time:12-10

I am not well versed in Perl and am having trouble understanding how the first split is used in the following script snippet.

On this line:

@splitEachJudge = split / \(/ig,$orig_content;

it does not seem to follow the usual syntax.

Could someone explain how the split works here?

$glb_doc2Cont = "<AJudge>Andrew and Alvin</AJudge>";
$line = "<AJudge>Andrew and Alvin</AJudge>";


if ($line =~ /<AJudge>(.*?)<\/AJudge>/ig) {
        $orig_content = $1;
        $content = $1;

        @splitEachJudge = split / \(/ig,$orig_content;
        print("Last index of array= $#splitEachJudge\n");
        print("EachJudge1: $splitEachJudge[0]\n");
        print("EachJudge2: $splitEachJudge[1]\n");

        do {
                local @ARGV = ($splitEachJudge[1]);
                eval { require 'Cleanup.pl'};
                $judge2 = cleanLeadingTrailingSpace();
                print("Judge2: $judge2\n");     
            };

        if ($#splitEachJudge eq "1") {
            if ($splitEachJudge[0] =~ / and /i) {
                @eachJudgeAnd = split / and /ig,$splitEachJudge[0];
                    do {
                        local @ARGV = ($eachJudgeAnd[0]);
                        eval { require 'Cleanup.pl'};
                        $eachJudgeAnd[0] = cleanLeadingTrailingSpace();
                        local @ARGV = ($eachJudgeAnd[1]);
                        $eachJudgeAnd[1] = cleanLeadingTrailingSpace();
                    };
                if ($eachJudgeAnd[0] =~ /, /i) {
                    $StoreCommaJudge = "";
                        @eachJudgeComma = split /, /ig,$eachJudgeAnd[0];
                        for ($count=0;$count<=$#eachJudgeComma;$count++) {
                            $StoreCommaJudge .= "<Judge>$eachJudgeComma[$count]<\/Judge>, ";
                        }
                        $glb_doc2Cont=~s/<AJudge>\Q$content\E<\/AJudge>/<JCoram>$StoreCommaJudge and <Judge>$eachJudgeAnd[1]<\/Judge> \($judge2<\/JCoram>/ig;
                }else{
                    $glb_doc2Cont=~s/<AJudge>\Q$content\E<\/AJudge>/<JCoram><Judge>$eachJudgeAnd[0]<\/Judge> and <Judge>$eachJudgeAnd[1]<\/Judge> \($judge2<\/JCoram>/ig;
                }
            }
            else{
                $glb_doc2Cont=~s/<AJudge>\Q$content\E<\/AJudge>/<JCoram><Judge>$splitEachJudge[0]<\/Judge> \($judge2<\/JCoram>/ig;
            }
        }
        elsif ($#splitEachJudge eq "0"){
            @splitEachJudge2 = split /:/ig,$orig_content;
            $glb_doc2Cont=~s/<AJudge>\Q$content\E<\/AJudge>/<JCoram><Judge>$splitEachJudge2[0]<\/Judge>:<\/JCoram>/ig;
        }
    }

CodePudding user response:

You have

split / \(/ig, $orig_content;

The g flag makes no sense for split, but it's ignored.

The match operator (/ \(/ig) is not evaluated as a match; it is passed to split as a pattern. This means the above is equivalent to

split qr/ \(/i, $orig_content;

qr/ \(/i returns a compiled regex pattern. The pattern matches a space followed by a left paren. (The i makes the match case-insensitive, which has no effect here because the pattern contains no letters.)
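
One way to check this equivalence is to run both forms on the same string. This is only a minimal sketch: the sample string and the variable names are made up for illustration and do not come from the original script.

```perl
use strict;
use warnings;

my $text = "aaa (bbb (ccc";   # hypothetical sample input

# Pattern written inline: split receives it as a pattern,
# it is not run as a match against $_ first.
my @inline = split / \(/i, $text;

# The same pattern compiled with qr// and then handed to split.
my $re       = qr/ \(/i;
my @compiled = split $re, $text;

print join("|", @inline),   "\n";
print join("|", @compiled), "\n";
```

Both print statements should show the same list of fields.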

split therefore searches the string in $orig_content for each occurrence of a space followed by a left paren, and returns the substrings separated by those occurrences.

For example,

split / \(/, "aaa (bbb (ccc"

returns three strings: aaa, bbb, ccc.
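
Applied to the data in the question, the same rule can be sketched as follows. The first string is the content captured from the <AJudge> element in the question; the second string, with a parenthesised part, is a hypothetical input added only for comparison.

```perl
use strict;
use warnings;

# Content captured from <AJudge>...</AJudge> in the question: it has no " (".
my @no_paren = split / \(/, "Andrew and Alvin";
print "Last index: $#no_paren\n";      # 0, the whole string is the only element

# Hypothetical input that does contain " (" (not taken from the question).
my @with_paren = split / \(/, "Andrew and Alvin (Judges)";
print "Last index: $#with_paren\n";    # 1, two elements
print "First:  $with_paren[0]\n";      # Andrew and Alvin
print "Second: $with_paren[1]\n";      # Judges)
```

With the sample line from the question, $#splitEachJudge is therefore 0, $splitEachJudge[1] is undefined, and the elsif branch is the one that runs.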
