In the array library in Python, what's the most efficient way to preallocate with zeros (for example for an array size that barely fits into memory)?
Creating a huge list first would partially defeat the purpose of choosing the array library over lists for efficiency.
Is it possible to preallocate even faster, without spending time on overwriting whatever there was in memory?
(Preallocating is important, because then the code is faster than growing an array iteratively, as discussed here.)
CodePudding user response:
Use sequence multiplication, just like you would to create a giant list, but with an array:
import array
arr = array.array(typecode, [0]) * size_needed
Note the position of the parentheses. We directly multiply a 1-element array by an element count, rather than multiplying a 1-element list by an element count and then converting to an array.
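For a concrete picture, here is a minimal sketch of the multiplication approach. The typecode "d" and the element count are illustrative assumptions, not values from the question:

```python
import array

typecode = "d"           # assumption: C double, chosen only for illustration
size_needed = 1_000_000  # assumption: illustrative element count

# Multiply a 1-element array, not a 1-element list, so that no
# million-element list of Python objects is ever built.
arr = array.array(typecode, [0]) * size_needed

print(len(arr))                 # number of elements
print(arr.itemsize * len(arr))  # approximate size of the data buffer in bytes
```

The only list that ever exists here is the 1-element `[0]`, so peak memory stays close to the size of the final array.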
CodePudding user response:
Using the * operator is probably best. A much less efficient method:
from array import array
from itertools import repeat
arr = array(typecode, [])
arr.extend(repeat(0, size_needed))
You can shorten this with the iterable version of the constructor:
arr = array(typecode, repeat(0, size_needed))
Or, to avoid the extra import:
arr = array(typecode, (0 for _ in range(size_needed)))
The reason that this is much less efficient is that array.extend (which is what the iterable constructor delegates to) does not use the length or __length_hint__ of the iterable object that is passed in. Consequently, the array will be expanded with multiple reallocations until you reach the target size. This may end up being almost as inefficient as allocating the list. While repeat does define a proper __length_hint__, generator expressions do not, but that is a moot point for the current CPython implementation.
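To check these claims on your own machine, something like the following rough sketch could be used. The typecode "q", the element count, and the helper names by_multiplication and by_extend are assumptions made for illustration; timings will vary by platform and Python version, so no particular numbers are implied:

```python
from array import array
from itertools import repeat
from operator import length_hint
from timeit import timeit

typecode = "q"           # assumption: signed 64-bit integers, for illustration
size_needed = 1_000_000  # assumption: illustrative element count

def by_multiplication():
    # Hypothetical helper: a single allocation of the final size.
    return array(typecode, [0]) * size_needed

def by_extend():
    # Hypothetical helper: grows the array while consuming the iterator.
    arr = array(typecode)
    arr.extend(repeat(0, size_needed))
    return arr

# repeat reports its remaining length; a generator expression does not,
# so length_hint falls back to its default of 0.
hint_repeat = length_hint(repeat(0, size_needed))
hint_genexp = length_hint(0 for _ in range(size_needed))
print(hint_repeat, hint_genexp)

# Both approaches produce the same array; only the cost differs.
assert by_multiplication() == by_extend()

t_mul = timeit(by_multiplication, number=5)
t_ext = timeit(by_extend, number=5)
print(f"multiplication: {t_mul:.4f} s, extend(repeat): {t_ext:.4f} s")
```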