Perl 5 : assignment of an anonymous arrayref, where the array is still empty = copy-construction?

Time:07-25

I thought I knew a bit about Perl references and how to work with them. I cut my teeth on Perl 5.005. Right now I have a piece of code, freshly written in Perl 5.32, where I'm stumped by the behavior of some array reference operations.

Here is my minimal example:

#!/usr/bin/perl

my $array_ref = (); # create an anonymous array and keep a reference to it
my $another_ref = $array_ref; # assign the reference (not an array deep copy, or is it?)

print "Pushing foo and bar\n";
push @{ $array_ref }, "foo";
push @{ $array_ref }, "bar";

print "Array ref element count: " . $#{ $array_ref } . "\n";
print "Another ref element count: " . $#{ $another_ref } . "\n";

print "Pushing baz\n";
push @{ $array_ref }, "baz";

print "Array ref element count: " . $#{ $array_ref } . "\n";
print "Another ref element count: " . $#{ $another_ref } . "\n";

The resulting output:

# ./test.pl
Pushing foo and bar
Array ref element count: 1
Another ref element count: -1
Pushing baz
Array ref element count: 2
Another ref element count: -1

My production code creates an empty anonymous array and takes a reference to it, then stores a copy of the reference in some dynamic data structure, and then proceeds to add some elements to the anonymous array through the original "my" (local) reference, before that "my" variable goes out of scope. Curiously to me, while the elements do get added via the originally obtained array reference, they do not appear via the "copied reference", as if the line

my $another_ref = $array_ref;

behaved more like a copy-construction, and I ended up with a new, independent array. I.e., by the observed behavior, it appears to perform a deep copy. No syntax error gets reported by Perl.

It then occurred to me to try the arrayref assignment after some elements had been pushed onto the original array. A single line has moved in the source code:

#!/usr/bin/perl

my $array_ref = (); # create an anonymous array and keep a reference to it
#my $another_ref = $array_ref; # moved below:

print "Pushing foo and bar\n";
push @{ $array_ref }, "foo";
push @{ $array_ref }, "bar";
my $another_ref = $array_ref; # moved here

print "Array ref element count: " . $#{ $array_ref } . "\n";
print "Another ref element count: " . $#{ $another_ref } . "\n";

print "Pushing baz\n";
push @{ $array_ref }, "baz";

print "Array ref element count: " . $#{ $array_ref } . "\n";
print "Another ref element count: " . $#{ $another_ref } . "\n";

Resulting output:

# ./test.pl
Pushing foo and bar
Array ref element count: 1
Another ref element count: 1
Pushing baz
Array ref element count: 2
Another ref element count: 2

Now that does look like a proper "copy of the reference only".

So I get a deep copy on a reference to an empty array, but a shallow copy on a populated array? Ouch! Is this a feature by any chance? In what way?

This is making me wonder if I do mind. I do have the option to assign the array reference at the end of my "initial" scope, where the anonymous array gets instantiated. If by the end of that scope the array is still empty, further elements can indeed be added later on, but by then the original reference will have perished, and I'd be accessing the array via the "persistent" reference only anyway, with consistent results. No harm done, just something to be aware of in my particular situation.

CodePudding user response:

That first line, which declares $array_ref, does not "create an anonymous array".

my $var = ();  # just an uninitialized scalar

merely declares a scalar variable and assigns an empty list to it, to no effect at all. The variable stays uninitialized. One can see this by printing ref($var), which shows an empty string (not ARRAY), or by printing the variable itself, which (with warnings enabled) gives a warning about using an uninitialized value instead of ARRAY(0x...), the stringification of an array reference. Or use Devel::Peek.
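The check described above can be sketched as follows. This is a minimal illustration, not part of the original code; the names $is_defined and $ref_type are made up for the example.

```perl
use strict;
use warnings;

my $var = ();    # empty list assigned to a scalar: $var stays undef

my $is_defined = defined $var ? 1 : 0;
my $ref_type   = ref $var;    # ref returns an empty string for a non-reference

print "defined: $is_defined\n";
print "ref: '$ref_type'\n";
```

Both values show that no array reference was ever stored in $var.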

So assigning that variable to another one at its declaration again has no effect, and $another_ref is just another uninitialized, completely unrelated scalar.

But having declared a scalar, and not having initialized it, still allows us to turn it into a reference. So when you dereference it, to push values onto it, an anonymous array is indeed constructed (via autovivification), and after

push @$var, 1;  # now it "became" an array reference

that $var is an array reference. Well, it's a scalar whose value is an array reference, so nothing very strange happened, and this is how it's often done: declare a scalar, then later assign a reference to it or construct one in it. However, it may surprise.
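A small sketch of that autovivification step, with the before and after state captured in illustrative variables ($before, $after and $count are names invented for this example):

```perl
use strict;
use warnings;

my $var;                     # undef, not a reference yet
my $before = ref $var;       # empty string

push @$var, 1;               # dereferencing for modification autovivifies an array

my $after = ref $var;        # now 'ARRAY'
my $count = scalar @$var;    # number of elements in the new array

print "before: '$before', after: '$after', elements: $count\n";
```

The push is the point where the anonymous array comes into existence, not the declaration.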

In the second attempt $another_ref is indeed assigned an array reference, since in the meantime $array_ref did get "elevated" into an array reference. (But that is not a "deep copy" and will be valid only for a reference to an array having no references.)
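To illustrate that the second attempt copies only the reference, here is a hedged sketch that mirrors the question's variable names; $same_array and $count_via_copy are additions for the example. Comparing two references numerically with == tells whether they point to the same array.

```perl
use strict;
use warnings;

my $array_ref;
push @$array_ref, "foo", "bar";    # $array_ref now holds an array reference

my $another_ref = $array_ref;      # copies the reference, not the array

push @$array_ref, "baz";           # the change is visible through both references

# References compare equal numerically when they point to the same thing.
my $same_array     = $array_ref == $another_ref ? 1 : 0;
my $count_via_copy = scalar @$another_ref;

print "same array: $same_array\n";
print "elements via another_ref: $count_via_copy\n";
```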

To declare a scalar and make it into an array reference do

my $array_ref = [];

This is usually unneeded, since an uninitialized scalar intended for an array reference will be made into one once it is used that way.
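In the question's situation, though, the reference is copied before anything has been pushed, and there the explicit [] is what makes both variables refer to the same array. A minimal sketch of the question's first example with only that one change ($last_index is a name added here):

```perl
use strict;
use warnings;

my $array_ref   = [];            # a reference to a new, empty anonymous array
my $another_ref = $array_ref;    # a second reference to the same array

push @{ $array_ref }, "foo";
push @{ $array_ref }, "bar";

# $#{ ... } is the index of the last element, i.e. the element count minus one.
my $last_index = $#{ $another_ref };
print "Another ref last index: $last_index\n";
```

Note that $#{ $ref } gives the last index rather than the element count, which is why the question's output shows 1 after two pushes.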

An exception is if that scalar needs to be passed into a subroutine which needs to be able to tell whether it got a reference or not; then we do need to first assign a reference to it.
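As a sketch of that exception, consider a hypothetical helper, describe_arg (not from the original code), that uses ref to tell what it received:

```perl
use strict;
use warnings;

# Hypothetical helper: reports whether its argument is an array reference.
sub describe_arg {
    my ($arg) = @_;
    return ref($arg) eq 'ARRAY' ? 'array reference' : 'not an array reference';
}

my $uninitialized;       # declared only, still undef
my $initialized = [];    # explicitly an array reference

my $first  = describe_arg($uninitialized);
my $second = describe_arg($initialized);

print "$first / $second\n";
```

Only the scalar that was initialized with [] is recognized as an array reference by the subroutine.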


Except, perhaps, that it may seem strange that we can treat a merely undefined scalar as an array reference (by dereferencing it and pushing values onto it), and that it does become one right there.
