=SUM(SEQUENCE(10000000))
The formula above is able to sum upto 10 million virtual array elements. We know that 10 million is the limit according to this question and answer. Now, if the same is implemented as Lambda using Lambda helper function REDUCE
:
=REDUCE(,SEQUENCE(10000000),LAMBDA(a,c,a c))
We get,
Calculation limit was reached when trying to calculate this formula
This can happen in 2 cases:
- The computation for the formula takes too long.
- It uses too much memory.
To resolve it, use a simpler formula to reduce complexity.
So, it says the reason is space and time complexity. But what is the exact space used to throw this error? How is this determined?
In the REDUCE
function above, the limit was at around 66k for a virtual array:
=REDUCE(,SEQUENCE(66660),LAMBDA(a,c,a c))
However, if we remove the addition criteria and make it return only the current value c
, the allowed virtual array size seems to increase to 190k:
=REDUCE(,SEQUENCE(190000),LAMBDA(a,c,c))
After which it throws a error. So, what factors determine the memory limit here? I think it's memory limit, because it throws the error almost within a few seconds.
CodePudding user response:
A partial answer
We know for a fact, the following factors decide the the memory limit drum roll
LAMBDA
function calls- Number of operations
The base number for 1 operation seems to be 199992
1 2(=REDUCE(,SEQUENCE(199992),LAMBDA(a,c,c))
). But for a zero-op or a no-op(=REDUCE(,SEQUENCE(10000000),LAMBDA(a,c,0))
), the memory limit is much higher, but you'll still run into time limit. We also know number of operations is a factor, because
=REDUCE(,SEQUENCE(66664/1),LAMBDA(a,c,a c))
fails=REDUCE(,SEQUENCE(66664),LAMBDA(a,c,a c))
works.=REDUCE(,SEQUENCE(66664),LAMBDA(a,c,a c 0))
fails
Note that size of operation doesn't matter. If =REDUCE(,SEQUENCE(39998),LAMBDA(a,c,a c 0))
works, =REDUCE(,SEQUENCE(39998),LAMBDA(a,c,a c 100000))
will also work. For each increase in number of operations inside the lambda function, the maximum allowed array size falls by 2n-1
( Credits to @OlegValter for actually figuring out there's a factor multiple here):
Maximum sequence | Number of operations (inside lambda) |
Reduction (from 199992) |
Formula |
---|---|---|---|
199992 | 1 | 1 | REDUCE(,SEQUENCE(199992),LAMBDA(a,c,c)) |
66664 | 2 | 1/3 | REDUCE(,SEQUENCE(66664),LAMBDA(a,c,a c)) |
39998 | 3 | 1/5 | REDUCE(,SEQUENCE(39998),LAMBDA(a,c,a c 10000)) |
28570 | 4 | 1/7 | REDUCE(,SEQUENCE(28570),LAMBDA(a,c,a c 10000 0)) |
Operations outside the LAMBDA
functions also count. For eg, =REDUCE(,SEQUENCE(199992/1),LAMBDA(a,c,c))
will fail due to extra /1
operation, but you only need to reduce the array size linearly by 1 or 2 per operation, i.e., this =REDUCE(,SEQUENCE(199990/1),LAMBDA(a,c,c))
will work3.
In addition LAMBDA
function calls itself cost more. So, refactoring your code doesn't eliminate the memory limit, but increases it. For eg, if your code uses LAMBDA(a,c,(a-1) (a-1))
, if you add another lambda like this: LAMBDA(a,c,LAMBDA(aminus,aminus aminus)(a-1))
, it errors out with much less array elements than before(~20% less). LAMBDA
is much more expensive than repeating calls.
There are many other factors at play, especially with other LAMBDA
functions. Google might change their mind about these arbitrary limits later. But this gives a start.
CodePudding user response:
not even an answer
by brute-forcing a few ideas it looks like there are more hidden variables than previously thought. it is probably safe to say that the upper limit is a result of "running out of memory" especially when calculation time does not play any role. the thing is that there are factors even outside of LAMBDA
that affects the computational capabilities of the formula.
establishing the baseline:
all below formulae are maxed out so they work as "one step before erroring out". please keep noticing the numbers as a direct representation of row (not cell) processing abilities
starting with a simple:
=ROWS(BYROW(SEQUENCE(99994), LAMBDA(x, AVERAGE(x))))
by adding one more x
the following would error out so even the length of strings matters:
=ROWS(BYROW(SEQUENCE(99994), LAMBDA(xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, AVERAGE(xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx))))
doubling the array brings no issues:
=ROWS(BYROW({SEQUENCE(99994), SEQUENCE(99994)}, LAMBDA(x, AVERAGE(x))))
but additional "stuff" will reduce the output by 1:
=ROWS(BYROW({SEQUENCE(99993), SEQUENCE(99993, 1, 5)}, LAMBDA(x, AVERAGE(x))))
interestingly this one runs with no problem so now even the complexity of input matters (?):
=ROWS(BYROW(SEQUENCE(99994, 6, 0, 5), LAMBDA(x, AVERAGE(x))))
and with this one, it seems that even choice of formula selection matters:
=ROWS(BYROW(RANDARRAY(99996, 2), LAMBDA(x, AVERAGE(x))))
but what if we move from virtual input to real input... A1 cell being set to =RANDARRAY(105000, 3)
we can have:
=ROWS(BYROW(A1:B99997, LAMBDA(x, AVERAGE(x))))
again, it's not a matter of cells because even with 8 columns we can get the same:
=ROWS(BYROW(A1:H99997, LAMBDA(x, AVERAGE(x))))
not bad, however, indirecting the range will put us back to 99995:
=ROWS(BYROW(INDIRECT("A1:B"&99995), LAMBDA(x, AVERAGE(x))))
another fact is that LAMBDA
as a standalone function runs flawlessly even with an array 105000×8
(that's solid 840K cells)
=LAMBDA(x, AVERAGE(x))(A1:H105000)
so is this really the memory issue of LAMBDA
(?) or the factors that determine the memory used in LAMBDA
are limits of unknown origin bestowed upon LAMBDA
by individual incapabilities of:
MAP
SCAN
BYCOL
BYROW
REDUCE
MAKEARRAY
and their unoptimized memory demands shaken by wast variety of yet unknown variables within our spacetime