Code does not remove non-ascii characters from variable

Time:09-17

Why do the following lines of code not remove non-ASCII characters from my variable and replace each run of them with a single space?

$text =~ s/[[:^ascii:]]+/ /rg;
$text =~ s/\h+/ /g;

Whereas this works to remove newlines?

$log_mess =~ s/[\r\n]+//g;

CodePudding user response:

To explain the problem for anyone finding this question in the future:

$text =~ s/[[:^ascii:]]+/ /rg;

The problem is the /r option on the substitution operator (s/.../.../).

This operator is documented in the "Regexp Quote-Like Operators" section of perlop. It says this about /r:

r - Return substitution and leave the original string untouched.

Normally, the substitution operator modifies the string it is bound to (e.g. your variable $text) and returns the number of substitutions made. Sometimes you don't want that: you want the original variable to remain unchanged and the altered string to be returned, so that you can store it in a new variable. That is what /r does.

Previously, you would do this:

my $new_var = $var;
$new_var =~ s/regex/substitution/;

But since the /r option was added, you can simplify that to:

my $new_var = $var =~ s/regex/substitution/r;
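
To make the difference concrete, here is a minimal sketch comparing the two forms. The sample string and the variable names $copy and $count are made up for illustration; /r needs Perl 5.14 or later.

```perl
use strict;
use warnings;

my $var = "hello world";

# Without /r: the bound variable is modified in place, and the
# operator returns the number of substitutions it made.
my $copy  = $var;
my $count = ($copy =~ s/o/0/g);

# With /r: $var is left untouched and the modified copy is returned.
my $new_var = $var =~ s/o/0/gr;

print "$var\n";      # hello world
print "$copy\n";     # hell0 w0rld
print "$count\n";    # 2
print "$new_var\n";  # hell0 w0rld
```

Note that with /r the return value is the only place the result exists, so it has to be assigned or otherwise used.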

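I'm not sure why you used /r in your code (I guess you copied it from somewhere else), but you don't need it here, and it is the reason your original string is left unchanged: the modified copy is returned and then thrown away. With warnings enabled, Perl reports this as a useless use of non-destructive substitution in void context.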
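
As a rough sketch of the fix, either drop /r so that $text is changed in place, or keep /r and assign what it returns. The sample input below is invented for illustration, and the patterns assume a + quantifier after each character class, so that a run of matching characters collapses to one space.

```perl
use strict;
use warnings;

# Hypothetical input: three non-ASCII characters (U+00E9, U+2603, U+2603)
# followed by a run of spaces.
my $original = "caf\x{e9}\x{2603}\x{2603}bar   baz";

# Option 1: no /r, so $text itself is modified.
my $text = $original;
$text =~ s/[[:^ascii:]]+/ /g;   # each run of non-ASCII characters -> one space
$text =~ s/\h+/ /g;             # each run of horizontal whitespace -> one space

# Option 2: keep /r, but store the returned string.
my $clean = $original =~ s/[[:^ascii:]]+/ /gr;
$clean =~ s/\h+/ /g;

print "$text\n";    # caf bar baz
print "$clean\n";   # caf bar baz
```

Both options give the same result; the second is only worth it when the original string must be kept as well.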

Tags: perl