Unable to use BYCOL and SCAN in the same formula-CodePudding

I am trying to calculate running (cumulative) totals per column on a range (please see picture below).

If I use one SCAN function per column, it works. But I have to write as many SCAN functions as I have columns.

The problem is that I want to use a single dynamic array formula that includes all the columns, whether current or future.

When I try to use BYCOL together with LAMBDA and SCAN, it does not work. I wonder whether BYCOL is able to work with functions that SPILL.

If BYCOL is incompatible with SCAN, is there a workaround to use a single formula for all my running totals?

CodePudding user response：

On cell D2 put the following formula:

=LET(set, A2:B5, m, COLUMNS(set), seq, SEQUENCE(m,1),
 CUMULATE, LAMBDA(x, SCAN(0, x, LAMBDA(acc,item, acc item))),
 DROP(REDUCE(0,seq, LAMBDA(acc,idx, HSTACK(acc, CUMULATE(INDEX(set,,idx))))),,1)
)

Here is the output:

It also works without removing the first column via DROP if the accumulator is initialized properly, with the cumulative sum of the first column, as follow:

=LET(set, A2:B5, m, COLUMNS(set), seq, SEQUENCE(m,1),
 CUMULATE, LAMBDA(x, SCAN(0, x, LAMBDA(acc,item, acc item))),
 REDUCE(0,seq, LAMBDA(acc,idx, IF(idx = 1, CUMULATE(INDEX(set,,idx)),
  HSTACK(acc, CUMULATE(INDEX(set,,idx))))))
)

Explanation

BYCOL returns one cell per column, that is why you get #CALC! error related to Nested Array Error. It can be circumvent using DROP/REDUCE/HSTACK pattern, explained here for example: how to transform a table in Excel from vertical to horizontal but with different length answer provided by @DavidLeal.

For your particular case, we need to calculate the cumulative sum for each column. We created a user LAMBDA function for that: CUMULATE:

LAMBDA(x, SCAN(0, x, LAMBDA(acc, item, acc item)))

Now we use the pattern to iterate over all columns. REDUCE function needs an input array, so we create a seq name with the column positions. We can access to each column of the input range (set), via INDEX(set,,idx), where idx represents the column number.

Now we use the pattern DROP/REDUCE/HSTACK explained in the above link to generate each column:

DROP(REDUCE(0, arr, LAMBDA(acc, x, HSTACK(acc, func(x)))),,1)

In our case func(x) will be the user LAMBDA function we just created: CUMULATE, so in our case it will be:

DROP(REDUCE(0,seq, LAMBDA(acc,idx, HSTACK(acc, CUMULATE(INDEX(set,,idx))))),,1)

What it does is to invoke for each column of set the CUMULATE LAMBDA function and via HSTACK appends the column created on each iteration. The variable acc represents the accumulator, so we start with initial value of the accumulator 0, then we add recursively via HSTACK the following columns. Finally we need to remove the first column via DROP(result,,1) that represents the first iteration that doesn't calculate the cumulative sum and just generate a column like this one:

0
#N/A
#N/A
#N/A

Note: If an array has fewer rows than the maximum width of the selected arrays, HSTACK returns a #N/A error in the additional rows. That is why you get such values. Bottom line, the first column needs to be removed, because we didn't initialized with a valid value the first iteration. I provided a second alternative that doesn't require to remove this column but at the end is a more verbose formula. It is a matter of personal preferences.

CodePudding user response：

Why not just employ some basic matrix multiplication?

=LET(ζ,A2:B5,ξ,ROWS(A2:B5),MMULT(N(SEQUENCE(ξ)>=SEQUENCE(,ξ)),ζ))