Pre-allocate logical array with unassigned elements (not true or false)

Time:07-22

I'm looking for the most efficient method of pre-allocating a logical array in MATLAB without specifying true or false at the time of pre-allocation.

When pre-allocating e.g. a 1×5 numeric array I can use nan(1,5). To my mind, this is better than using zeros(1,5), since I can easily tell which slots have been filled with data versus those that are yet to be filled. If using the zeros() solution it's hard to know whether any 0s are intentional 0s or just unfilled slots in the array.

I'm aware that I can pre-allocate a logical array using true(1,5) or false(1,5). The problem with these is similar to the use of zeros() in the numeric example: there's no way of knowing whether a slot has been filled or not.

I know that one solution to this problem is to treat the array as numeric, pre-allocate it using nan(1,5), and convert it to a logical array only once all the slots are filled. But this strikes me as inefficient.

Is there some smart way to pre-allocate a logical array in MATLAB and remain agnostic as to the actual content of that array until it is ready to be filled?

Answer:

The short answer is no. The point of a logical array is that each element takes a single byte, and the implementation is capable of storing only two states (true=1 or false=0). You might assume that logicals only need a single bit, but in fact they use 8 bits (a byte) to avoid compromising on performance.
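To see the per-element cost for yourself, you can compare the byte counts reported by whos for arrays of the same length. This is only a small sketch; the variable names are made up for illustration.

```matlab
% Compare the storage used by 1000-element arrays of each type.
% (All variable names here are illustrative assumptions.)
n = 1000;
asLogical = false(1, n);          % 1 byte per element
asSingle  = nan(1, n, 'single');  % 4 bytes per element
asDouble  = nan(1, n);            % 8 bytes per element

infoLogical = whos('asLogical');
infoSingle  = whos('asSingle');
infoDouble  = whos('asDouble');

fprintf('logical: %d bytes\n', infoLogical.bytes);
fprintf('single:  %d bytes\n', infoSingle.bytes);
fprintf('double:  %d bytes\n', infoDouble.bytes);
```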

If memory is a concern, you could use a single-precision array instead of a double array, moving from 64-bit to 32-bit numbers while still being able to store NaN. Then you can cast to logical whenever required (assuming you have no NaNs by that point, otherwise the cast will throw an error).
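A minimal sketch of that single-precision approach might look like the following. The array length, the indices, and the values assigned are arbitrary examples, not anything from the question.

```matlab
% Pre-allocate as single-precision NaN; NaN marks "not filled yet".
% (Sizes, indices, and values below are illustrative assumptions.)
n = 5;
flags = nan(1, n, 'single');

flags(2) = true;    % stored as single 1
flags(4) = false;   % stored as single 0

unfilled = isnan(flags);   % true for slots 1, 3 and 5

% Calling logical(flags) at this point would error because NaNs remain,
% so fill the remaining slots with real data first.
flags([1 3 5]) = [true true false];

if ~any(isnan(flags))
    result = logical(flags);   % safe to convert: no NaNs left
end
```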

If it's really important to track whether a value was ever assigned whilst also reducing memory, you could keep a second logical array which you update at the same time as the first and which simply stores whether each value was ever assigned. This can then be used to check whether you have any default values left after assignments. Now we've dropped from 32-bit singles to two 8-bit logicals, which is worse than one logical but still uses half the memory of floating-point numbers kept for the sake of the NaN. Obviously assignment operations now take roughly twice as long as with a single logical array; I don't know how they compare to float assignments.
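As a rough sketch of the two-array idea, with hypothetical variable names (value, assigned) and arbitrary example indices:

```matlab
% Two logical arrays: one for the data, one tracking what was assigned.
% (Names, sizes, and indices are illustrative assumptions.)
n = 5;
value    = false(1, n);   % the actual booleans, defaulting to false
assigned = false(1, n);   % true wherever value has been written

% Every write to value is mirrored by a write to assigned
idx = [2 4];
value(idx)    = [true false];
assigned(idx) = true;

unfilledSlots = find(~assigned);   % slots still holding the default
fullyAssigned = all(assigned);     % false until every slot is written
```

Slot 4 here holds an intentional false, which the assigned mask distinguishes from the untouched false values in slots 1, 3 and 5.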

Going off-piste, you could write your own class to do this assignment tracking for you and display the logical array as if it were capable of storing NaNs. This isn't really recommended, but I've written the code below to complete the thought experiment. Note that you originally asked for "the most efficient method"; in terms of execution time, this is definitely not going to be as efficient as the native implementation of logical arrays.

classdef nanBool
    properties
        assigned % Tracks whether element of "value" was ever assigned
        value    % Tracks boolean array
    end
    methods 
        function obj = nanBool(varargin)
            % Constructor: initialise main and tracking arrays to false
            % handles same inputs as using "false()" normally
            obj.value = false(varargin{:});
            obj.assigned = false(size(obj.value));        
        end
        function b = subsref(obj,S)
            % Override the indexing operator so that indexing works like it
            % would for a logical array unless accessing object properties
            if strcmp(S(1).type,'.')
                % Property access, including chains such as a.assigned(2)
                b = builtin('subsref',obj,S);
            else
                b = builtin('subsref',obj.value,S);
            end
        end
        function obj = subsasgn(obj,S,B)
            % Override the assignment operator so that the value array is
            % updated when normal array indexing is used. In sync, update
            % the assigned state for the corresponding elements
            obj.value = builtin('subsasgn',obj.value,S,B);
            obj.assigned = builtin('subsasgn',obj.assigned,S,true(size(B)));
        end    
        function disp(obj)
            % Override the disp function so printing to the command window
            % renders NaN for elements which haven't been assigned
            a = double(obj.value);
            a(~obj.assigned) = NaN;
            disp(a);
        end
    end    
end

Test cases:

>> a = nanBool(3,1)

a = 
   NaN
   NaN
   NaN

>> a(2) = true

a = 
   NaN
     1
   NaN

>> a(3) = false

a = 
   NaN
     1
     0

>> a(:) = true

a = 
     1
     1
     1

>> whos a
  Name      Size            Bytes  Class      Attributes

  a         1x1                 6  nanBool       

>> b = false(3,1); whos b
  Name      Size            Bytes  Class      Attributes

  b         3x1                 3  logical        

Note that the whos test shows this custom class has the same memory footprint as two logical arrays of the same size. It also shows that the size is reported incorrectly (1x1 rather than 3x1), indicating we'd also have to override the size function in our custom class. I'm sure there are lots of other similar edge cases you'd want to handle.

You could check whether there are any "logical NaNs" (unassigned values) with something like this, or add a method to the class which does it:

fullyAssigned = all(a.assigned);

In R2021b and newer you can do more controlled indexing overrides for custom classes instead of using subsref and subsasgn, but I can't test this:

https://uk.mathworks.com/help/matlab/customize-object-indexing.html
