Home > front end >  Excel Groupby and Sum Returning Error due to blank rows
Excel Groupby and Sum Returning Error due to blank rows

Time:01-26

I have a dataset with the following structure

email               category
[email protected]     action
[email protected]    horror
[email protected]   romance
[email protected]     sci-fi
[email protected]   action
[email protected]    action
[email protected]
[email protected]    
[email protected]  sci-fi
[email protected]    horror       
[email protected]                 

and am trying to produce a column 'sum_category' that would group by email to get the sum of unique category values for each unique email like this

email               category     sum_category
[email protected]     action       3
[email protected]    horror       1
[email protected]   romance      2
[email protected]     sci-fi       3
[email protected]   action       2
[email protected]    action       1
[email protected]                 1
[email protected]                 1
[email protected]  sci-fi       1
[email protected]    horror       3
[email protected]                 3

Assuming Column A is 'Email' and Column C is 'category', I have tried using the following code but it is producing #DIV/0! error

SUMPRODUCT((A$2:A$100=A2)/COUNTIFS(C$2:C$100,C$2:C$100,A$2:A$100,A$2:A$100))

I have also tried using the following code but it is counting the blank values into the sum

SUMPRODUCT((A$2:A$100=A2)/COUNTIFS(C$2:C$100,C$2:C$100&"",A$2:A$100,A$2:A$100))

Appreciate any help I could get to resolve this, thank you.

CodePudding user response:

I presume you don't have Office 365, else this would be straightforward, for example using the following spill formula:

=LET(
    α, A2:A12,
    β, C2:C12,
    BYROW(α, LAMBDA(ζ, COUNTA(UNIQUE(FILTER(β, (α = ζ) * (β <> ""))))))
)

Otherwise, don't use the reciprocal COUNTIF/SUMPRODUCT set-up: it's incredibly slow compared to the following FREQUENCY/MATCH construction, in D2:

=SUM(
    IF(FREQUENCY(
        IF(A$2:A$12 = A2,
            IF(C$2:C$12 <> "", MATCH(C$2:C$12, C$2:C$12, 0))
            ),
            ROW(A$2:A$12) - MIN(ROW(A$2:A$12))   1
        ), 
      1
    )
)

and then copied down to D12.

  • Related