Substitute Markdown italics with HTML using regex in Perl

Time:02-12

To convert the Markdown italic text in $script to HTML, I've written this:

my $script = "*so what*";
my $res =~ s/\*(.)\*/$1/g;
print "<em>$1</em>\n";

The expected result is:

<em>so what</em>

but it gives:

<em></em>

How can I make it give the expected result?

CodePudding user response:

Problems:

  • You print the wrong variable.
  • You switch variable names halfway through.
  • . matches exactly one character, so it can't match a phrase like "so what".
  • You always add one EM element, even if no stars are found.
  • You always add one EM element, even if multiple pairs of stars are found.
  • You add the EM element around the entire output, not just the portion in stars.
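
The single-character point is easy to check in isolation. The following is a minimal sketch, not part of the original code; the variable names $single and $multi are introduced here for illustration only. In the question's code the substitution is additionally bound to the empty $res, so it never matches and $1 stays unset, which is consistent with the empty <em></em> output.

```perl
use strict;
use warnings;

my $script = "*so what*";

# `.` matches exactly one character, so this pattern needs a single
# character between the two stars and fails on "*so what*".
my $single = $script =~ /\*(.)\*/ ? "matched '$1'" : "no match";

# A quantifier lets the capture group span the whole phrase.
my $multi = $script =~ /\*([^*]+)\*/ ? "matched '$1'" : "no match";

print "single character: $single\n";   # single character: no match
print "one or more:      $multi\n";    # one or more:      matched 'so what'
```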

Fix:

$script =~ s{\*([^*]+)\*}{<em>$1</em>}g;
print "$script\n";

or

my $res = $script =~ s{\*([^*]+)\*}{<em>$1</em>}gr;
print "$res\n";
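
For reference, here is the second variant as a complete script. This is only a sketch: the helper name italics_to_html is hypothetical and introduced for illustration, and the /r flag it relies on needs Perl 5.14 or later.

```perl
use strict;
use warnings;
use v5.14;   # the /r flag requires Perl 5.14 or later

# Hypothetical helper name, introduced for illustration only.
sub italics_to_html {
    my ($text) = @_;
    # /r returns the modified copy and leaves $text untouched.
    return $text =~ s{\*([^*]+)\*}{<em>$1</em>}gr;
}

print italics_to_html("*so what*"), "\n";        # <em>so what</em>
print italics_to_html("no stars here"), "\n";    # no stars here
print italics_to_html("*one* and *two*"), "\n";  # <em>one</em> and <em>two</em>
```

Because the replacement is applied per match, text without stars passes through unchanged and each starred span gets its own EM element.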

But that's not all. Even with all the aforementioned problems fixed, your parser still has numerous other bugs. For example, it misapplies italics to all of the following:

  • **Important**
    Correct: <strong>Important</strong>
    Your code: *<em>Important</em>*
  • 4 * 5 * 6 = 120
    Correct: 4 * 5 * 6 = 120
    Your code: 4 <em> 5 </em> 6 = 120
  • 4 * 6 = 20 is *wrong*
    Correct: 4 * 6 = 20 is <em>wrong</em>
    Your code: 4 <em> 6 = 20 is </em>wrong*
  • `foo *bar* baz`
    Correct: <code>foo *bar* baz</code>
    Your code: `foo <em>bar</em> baz`
  • \*I like stars\*
    Correct: *I like stars*
    Your code: \<em>I like stars\</em>
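
These cases can be reproduced by running the fixed substitution over the five inputs listed above. The script below is a small sketch for that purpose only; the @inputs and @outputs names are assumptions made for this example, and it needs Perl 5.14 or later for /r.

```perl
use strict;
use warnings;
use v5.14;   # the /r flag requires Perl 5.14 or later

# Single quotes keep the backslashes and backticks literal.
my @inputs = (
    '**Important**',
    '4 * 5 * 6 = 120',
    '4 * 6 = 20 is *wrong*',
    '`foo *bar* baz`',
    '\*I like stars\*',
);

my @outputs;
for my $input (@inputs) {
    # Same substitution as in the fix: it knows nothing about bold,
    # arithmetic, code spans, or backslash escapes.
    my $output = $input =~ s{\*([^*]+)\*}{<em>$1</em>}gr;
    push @outputs, $output;
    print "$input\n    => $output\n";
}
# Output lines, in order:
#   *<em>Important</em>*
#   4 <em> 5 </em> 6 = 120
#   4 <em> 6 = 20 is </em>wrong*
#   `foo <em>bar</em> baz`
#   \<em>I like stars\</em>
```

Each of these would need its own rule (bold, whitespace next to the star, code spans, escapes), which is why a single substitution is not enough here.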