How can I change font size specified in an HTML document using Perl?

Time:11-15

I am modifying some HTML pages and want to increase the font size dynamically with a regex. In my script below, I want the '8' and '3' to turn into '9' and '4', but I get '8++' and '3++' instead. Here is what I have:

#!/usr/bin/perl
use warnings;
use LWP::Simple;

my $content = "<TD><FONT STYLE=\"font-family:Verdana, Geneva, sans-serif\" SIZE=\"8\">this is just a bunch of text</FONT></TD>";
$content .= "<TD><FONT STYLE=\"font-family:Verdana, Geneva, sans-serif\" SIZE=\"3\">more text</FONT></TD>";

$content =~ s/SIZE="(\d+)">/SIZE="$1++">/g;

print $content;     

CodePudding user response:

I'll just skip the part about how regexps are a bad way to parse HTML, because sometimes a quick-and-dirty solution is good enough.

You can't use an operator inside a string like that. The ++ is just treated as plain text (as you found). You have to use the /e flag to indicate that the replacement should be evaluated as Perl code, and then use an appropriate expression, like:

$content =~ s/SIZE="(\d+)">/'SIZE="' . ($1 + 1) . '">'/eg;

You can't use $1++ for two reasons. First, as a post-increment it would do the increment after returning the value, so you'd be replacing 8 with 8 instead of 9. Second, $1 is a read-only value, and the increment would want to modify it.
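
For reference, here is a minimal sketch of the same substitution wrapped in a small self-contained script. The helper name bump_font_sizes and the shortened sample markup are made up for illustration and are not from the question. It also assumes the attribute is always written as uppercase SIZE with a double-quoted, purely numeric value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the question): returns a copy of $html
# with every numeric SIZE="N" attribute value increased by one.
sub bump_font_sizes {
    my ($html) = @_;
    # /e evaluates the replacement as Perl code; /g repeats it for every match.
    # $1 + 1 computes a new value without trying to modify the read-only $1.
    $html =~ s/SIZE="(\d+)"/'SIZE="' . ($1 + 1) . '"'/eg;
    return $html;
}

my $sample = '<FONT SIZE="8">a</FONT><FONT SIZE="3">b</FONT>';
print bump_font_sizes($sample), "\n";
# prints: <FONT SIZE="9">a</FONT><FONT SIZE="4">b</FONT>
```

Because the helper works on a copy, the original string is left untouched, which makes it easy to compare the before and after versions while testing.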

CodePudding user response:

You should consider using an HTML parser such as HTML::TokeParser::Simple:

#!/usr/bin/perl

use strict; use warnings;

use HTML::TokeParser::Simple;

my $content = "<TD><FONT STYLE=\"font-family:Verdana, Geneva, sans-serif\" SIZE=\"8\">this is just a bunch of text</FONT></TD>";
$content .= "<TD><FONT STYLE=\"font-family:Verdana, Geneva, sans-serif\" SIZE=\"3\">more text</FONT></TD>";

my $parser = HTML::TokeParser::Simple->new( \$content );

while ( my $token = $parser->get_token ) {
    if ( $token->is_start_tag('font') ) {
        my $font_size = $token->get_attr('size');
        if ( defined $font_size ) {
            ++$font_size;
            $token->set_attr(size => $font_size);
        }
    }
    print $token->rewrite_tag->as_is;
}

Output:

<td><font style="font-family:Verdana, Geneva, sans-serif" size="9">this is just
a bunch of text</font></td><td><font style="font-family:Verdana, Geneva, 
sans-serif" size="4">more text</font></td>

CodePudding user response:

Use the /e modifier so that the replacement part is evaluated as Perl code, e.g.

$content =~ s/SIZE="(\d+)">/'SIZE="' . ($1 + 1) . '">'/ge;

CodePudding user response:

#!/usr/bin/perl -w

use strict;

sub main {
    my $c = qq{<TD><FONT STYLE="font-family:Verdana, Geneva, sans-serif" SIZE="8">this is just a bunch of text</FONT></TD>\n}
          . '<TD><FONT STYLE="font-family:Verdana, Geneva, sans-serif" SIZE="3">more text</FONT></TD>';

    # /e evaluates the replacement as code: keep the captured SIZE=" and closing
    # quote, and put the incremented number between them (without clobbering $_).
    $c =~ s/(SIZE=\")(\d+)(\")/$1 . ($2 + 1) . $3/eg;

    print "$c\n";
    # <TD><FONT STYLE="font-family:Verdana, Geneva, sans-serif" SIZE="9">this is just a bunch of text</FONT></TD>
    # <TD><FONT STYLE="font-family:Verdana, Geneva, sans-serif" SIZE="4">more text</FONT></TD>
}

main();