Local variable visibility in closures vs. local `sub`s

Time:07-07

Perl 5.18.2 accepts "local subroutines", it seems.

Example:

sub outer()
{
    my $x = 'x';   # just to make a simple example

    sub inner($)
    {
        print "${x}$_[0]\n";
    }

    inner('foo');
}

Without "local subroutines" I would have written:

#...
    my $inner = sub ($) {
        print "${x}$_[0]\n";
    };

    $inner->('foo');
#...

And most importantly I would consider both to be equivalent.

However the first variant does not work as Perl complains:

Variable $x is not available at ...

where ... describes the line where $x is referenced in the "local subroutine".

Who can explain this; are Perl's local subroutines fundamentally different from Pascal's local subroutines?

CodePudding user response:

The term "local subroutine" in the question seems to be referring to lexical subroutines. These are private subroutines visible only within the scope (block) where they are defined, after the definition; just like private variables.

But they are defined (or pre-declared) with my or state, as my sub subname { ... }

Just writing a sub subname { ... } inside of another doesn't make it "local" (in any version of Perl), but it is compiled just as if it were written alongside that other subroutine and is placed in their package's symbol table (main:: for example).
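As a minimal sketch of that point (the sub names and strings here are made up for illustration), a plain named sub written inside another sub can be called from file scope even if the enclosing sub never runs:

```perl
use strict;
use warnings;

sub outer {
    # A plain named sub written inside another sub is still a package sub.
    sub inner { return "inner got: $_[0]" }
    return inner('from outer');
}

# inner() is callable here even though outer() has not been called,
# because it lives in the main:: symbol table.
my $direct   = inner('from file scope');
my $in_table = defined &main::inner ? 1 : 0;

print "$direct\n";
print "in symbol table: $in_table\n";
```

Nothing about the placement inside outer restricts who can call inner.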


The question mentions closures in the title, so here is a comment on that.

A closure in Perl is a structure in a program, normally a scalar variable holding a reference to a sub, which carries the environment (variables) from its scope at its (runtime) creation. See also the perlfaq7 entry on it. It is messy to explain, so here is an example:

use feature qw( say );

sub gen {
    my $args = "@_";

    my $cr = sub { say "Closed over: $args, my args: @_" };
    return $cr;
}

my $f = gen( qw(args for gen) );

$f->("hi closed");
# Prints:
# Closed over: args for gen, my args: hi closed

The anonymous sub "closes over" the variables in scope where it's defined, in the sense that when its generating function returns the reference and goes out of scope, those variables live on because that reference still exists. Anonymous subs are created at runtime, so every time the generating function is called its lexicals are made anew and so is the anon sub, which therefore always has access to the current values. Thus the returned reference to the anon sub uses lexical data that would otherwise be gone. A little piece of magic.
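To illustrate that each call of the generating function produces a fresh set of lexicals and a fresh anonymous sub, here is a small sketch using a hypothetical make_counter helper (not from the question):

```perl
use strict;
use warnings;

# make_counter() is a hypothetical helper used only for illustration.
sub make_counter {
    my $count = shift;               # fresh lexical on every call
    return sub { return $count++ };  # anonymous sub created at runtime
}

my $c1 = make_counter(10);
my $c2 = make_counter(100);

# Each closure keeps its own $count alive and updates it independently.
my @seen = ( $c1->(), $c1->(), $c2->(), $c1->() );
print "@seen\n";   # 10 11 100 12
```

The two code references do not interfere with each other, since each closed over its own $count.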

Back to the question of "local" subs. If we want to introduce actual closures to the question, we'd need to return a code reference from the outer subroutine, like

sub outer {
    my $x = 'x' . "@_";
    return sub { say "$x @_" }
}
my $f = outer("args");
$f->( qw(code ref) );   # prints:  xargs code ref

Or, per the main question, as introduced in v5.18.0 and stable from v5.26.0, we can use a named lexical (truly nested!) subroutine

sub outer {
    my $x = 'x' . "@_";
    
    my sub inner { say "$x @_" };

    return \&inner;
}

In both cases my $f = outer(...); has the code reference returned from outer which correctly uses the local lexical variables ($x), with their most current values.
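A short sketch of the lexical-sub variant, assuming Perl 5.26 or newer (where lexical subs need no feature declaration), shows that code references from separate calls of outer each keep their own $x:

```perl
use v5.26;      # lexical subs are stable and always enabled from 5.26
use warnings;

sub outer {
    my $x = 'x' . "@_";
    my sub inner { return "$x @_" }
    return \&inner;
}

my $f1 = outer('first');
my $f2 = outer('second');

say $f1->('call');   # xfirst call
say $f2->('call');   # xsecond call
```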

But we cannot use a plain named sub inside outer for a closure:

sub outer {
    ...

    sub inner { ... }  # misleading, likely misguided and buggy

    return \&inner;    # won't work correctly
}

This inner is created at compile time and is global, so any variables it uses from outer are bound to the instances that exist when outer is called the first time. So inner is correct only until outer is called the next time, when the lexical environment in outer gets remade but inner doesn't. For an example see this post, and see the entry in perldiag (or add use diagnostics; to the program).


In my view a closure is in a way a poor man's object: it has functionality and data, made elsewhere at another time, and it can be used with data passed to it (and both can be updated).

CodePudding user response:

If you want "local" subs, you can use one of the following based on the level of backward compatibility you want:

  • 5.26 :

    my sub inner { ... }
    
  • 5.18 :

    use experimental qw( lexical_subs );  # Safe: Accepted in 5.26.
    
    my sub inner { ... }
    
  • "Any" version:

    local *inner = sub { ... };
    

However, you should not use sub inner { ... }.


sub f { ... }

is basically the same as

BEGIN { *f = sub { ... } }

so

sub outer {
   ...

   sub inner { ... }

   ...
}

is basically

BEGIN {
   *outer = sub {
      ...

      BEGIN {
         *inner = sub { ... };
      }

      ...
   };
}

As you can see, inner is visible even outside of outer, so it's not "local" at all.

And as you can see, the assignment to *inner is done at compile-time, which introduces another major problem.

use strict;
use warnings;
use feature qw( say );

sub outer {
   my $arg = shift;

   sub inner {
      say $arg;
   }

   inner();
}

outer( 123 );
outer( 456 );

Output:

Variable "$arg" will not stay shared at a.pl line 9.
123
123

5.18 did introduce lexical ("local") subroutines.

use strict;
use warnings;
use feature qw( say );
use experimental qw( lexical_subs );  # Safe: Accepted in 5.26.

sub outer {
   my $arg = shift;

   my sub inner {
      say $arg;
   };

   inner();
}

outer( 123 );
outer( 456 );

Output:

123
456

If you need to support older versions of Perl, you can use the following:

use strict;
use warnings;
use feature qw( say );

sub outer {
   my $arg = shift;

   local *inner = sub {
      say $arg;
   };

   inner();
}

outer( 123 );
outer( 456 );

Output:

123
456
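One difference worth keeping in mind with the local *inner approach is that local is dynamically scoped rather than lexically scoped. In the following sketch (helper is a hypothetical sub added for illustration), inner is visible to anything called while outer is running, and is removed again once outer returns:

```perl
use strict;
use warnings;

# helper() is not inside outer(), yet it can call inner() while outer() runs.
sub helper { return inner() }

sub outer {
    my $arg = shift;
    local *inner = sub { return "inner sees $arg" };
    return helper();
}

my $during = outer(123);
my $after  = defined &inner ? 'still defined' : 'gone';

print "$during\n";                  # inner sees 123
print "after outer(): $after\n";    # after outer(): gone
```

So this variant hides inner from code that runs before or after outer, but not from subs that outer calls.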

CodePudding user response:

I found a rather good explanation from man perldiag:

       Variable "%s" is not available
           (W closure) During compilation, an inner named subroutine or eval
           is attempting to capture an outer lexical that is not currently
           available.  This can happen for one of two reasons.  First, the
           outer lexical may be declared in an outer anonymous subroutine
           that has not yet been created.  (Remember that named subs are
           created at compile time, while anonymous subs are created at run-
           time.)  For example,

               sub { my $a; sub f { $a } }

           At the time that f is created, it can't capture the current value
           of $a, since the anonymous subroutine hasn't been created yet.

So this would be a possible fix:

sub outer()
{
    my $x = 'x';   # just to make a simple example

    eval 'sub inner($)
    {
        print "${x}$_[0]\n";
    }';

    inner('foo');
}
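A caveat with the string-eval version, sketched below under the assumption that warnings are enabled: the eval defines a global inner again on every call of outer, so the current $x is captured each time, but Perl reports a redefinition from the second call onwards. The warning handler and variable names are illustrative only.

```perl
use strict;
use warnings;

# Collect warnings so they can be inspected instead of printed to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

sub outer {
    my $x = shift;
    # Compiled at runtime, inside the running outer(), so it sees the current $x.
    eval 'sub inner { return "${x}$_[0]" }; 1' or die $@;
    return inner('foo');
}

my @results     = ( outer('a'), outer('b') );
my @redefined   = grep { /redefined/ } @warnings;

print "@results\n";                                        # afoo bfoo
print scalar(@redefined), " redefinition warning(s)\n";    # 1 redefinition warning(s)
```

The sub is also still global, so this is not a "local" subroutine either.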

...while this one won't:

sub outer()
{
    my $x = 'x';   # just to make a simple example

    eval {
        sub inner($)
        {
            print "${x}$_[0]\n";
        }
    };

    inner('foo');
}