Home > Net >  Excel - How to count unique records with specified condition in a column?
Excel - How to count unique records with specified condition in a column?

Time:01-06

I have a table like following:

enter image description here

And I want to count for each store, how many bills purchased single Item ? How many purchased 2 and full set of items ? The result should look like below

enter image description here

Where in store 'm3', bill 300 bought 2 item C but only count for 1 in the final result. This is where I struggle with, since I tried COUNTIFS(A:A, "=m1", B:B, "=A") (then add with count for B and C) to get single item for store m1 but unable to figure out how to distinguish with the unique bill numbers.

Please don't mind asking if needed more clarifications, and I do prefer excel in-built functions rather than VBA.

CodePudding user response:

Using Office 365 formula we start with the basic formula to find the singles:

=BYROW(E2:E4,
    LAMBDA(z,SUM(--(
        BYROW(--(COUNTIFS(A:A,z,C:C,UNIQUE($C$2:$C$8),B:B,{"A","B","C"})>0),
            LAMBDA(a,SUM(a)))=1))))

This will give you the single results:

enter image description here

The others are all variations of that formula:

A,B

=BYROW(E2:E4,
    LAMBDA(z,SUM(--(
        BYROW(--(COUNTIFS(A:A,z,C:C,UNIQUE($C$2:$C$8),B:B,{"A","B"})>0),
            LAMBDA(a,SUM(a)))=2))))

B,C

=BYROW(E2:E4,
    LAMBDA(z,SUM(--(
        BYROW(--(COUNTIFS(A:A,z,C:C,UNIQUE($C$2:$C$8),B:B,{"C","B"})>0),
            LAMBDA(a,SUM(a)))=2))))

A,C

=BYROW(E2:E4,
    LAMBDA(z,SUM(--(
        BYROW(--(COUNTIFS(A:A,z,C:C,UNIQUE($C$2:$C$8),B:B,{"C","A"})>0),
            LAMBDA(a,SUM(a)))=2))))

All

=BYROW(E2:E4,
    LAMBDA(z,SUM(--(
        BYROW(--(COUNTIFS(A:A,z,C:C,UNIQUE($C$2:$C$8),B:B,{"A","B","C"})>0),
            LAMBDA(a,SUM(a)))=3))))

enter image description here

With Older Versions of Excel we need to do some gymnastics with SUMPRODUCT, MMULT, INDEX, MODE.MULT, etc

=SUMPRODUCT(
    --(MMULT(
        --(COUNTIFS(A:A,E2,C:C,INDEX(C:C,N(IF({1},MODE.MULT(IF(MATCH($C$2:$C$8,C:C,0)=ROW($C$2:$C$8),ROW($C$2:$C$8)*{1,1}))))),B:B,{"A","B","C"})>0),
        {1;1;1})=1))
=SUMPRODUCT(
    --(MMULT(
        --(COUNTIFS(A:A,E2,C:C,INDEX(C:C,N(IF({1},MODE.MULT(IF(MATCH($C$2:$C$8,C:C,0)=ROW($C$2:$C$8),ROW($C$2:$C$8)*{1,1}))))),B:B,{"A","B"})>0),
        {1;1})=2))
=SUMPRODUCT(
    --(MMULT(
        --(COUNTIFS(A:A,E2,C:C,INDEX(C:C,N(IF({1},MODE.MULT(IF(MATCH($C$2:$C$8,C:C,0)=ROW($C$2:$C$8),ROW($C$2:$C$8)*{1,1}))))),B:B,{"B","C"})>0),
        {1;1})=2))
=SUMPRODUCT(
    --(MMULT(
        --(COUNTIFS(A:A,E2,C:C,INDEX(C:C,N(IF({1},MODE.MULT(IF(MATCH($C$2:$C$8,C:C,0)=ROW($C$2:$C$8),ROW($C$2:$C$8)*{1,1}))))),B:B,{"A","C"})>0),
        {1;1})=2))
=SUMPRODUCT(
    --(MMULT(
        --(COUNTIFS(A:A,E2,C:C,INDEX(C:C,N(IF({1},MODE.MULT(IF(MATCH($C$2:$C$8,C:C,0)=ROW($C$2:$C$8),ROW($C$2:$C$8)*{1,1}))))),B:B,{"A","B","C"})>0),
        {1;1;1})=3))

Each of these would be placed in the first row of their respective columns and confirmed as array formula by using Ctrl-Shift-Enter instead of Enter when exiting edit mode.

They they would be dragged/copied down the columns.

CodePudding user response:

Here, another alternative that generates the entire output for all cases with one formula. Use the following formula in cell E2:

=LET(ms, UNIQUE(A2:A8), ux, UNIQUE(A2:C8), CALC, LAMBDA(arr,cnt, BYROW(ms,
  LAMBDA(m, LET(subset, CHOOSECOLS(FILTER(ux, INDEX(ux,,1)=m),2,3),
   C, INDEX(subset,,2), SUM(BYROW(MMULT(IF(TOROW(C)=UNIQUE(C),1,0),
   TRANSPOSE(IF(TOROW(INDEX(subset,,1))=arr,1,0))),
   LAMBDA(r, N(SUM(r)=cnt)))))))),
  HSTACK(ms, CALC({"A";"B";"C"}, 1), CALC({"A";"B"},2), CALC({"B";"C"},2),
    CALC({"C";"A"},2), CALC({"A";"B";"C"}, 3)))

Here is the output: excel output

We create a user LAMBDA function CALC with input argument arr (items values) and cnt (count condition to check), so we can generate all possible output changing the input parameters via HSTACK function.

CALC uses ux name, that represents the input removing duplicated row, such as the combination: {m3,C,300}. Now we cannot use RACON functions, because we need to work with an array, therefore in order to do the count for Item and Bill column values, we use MMULT function combined with IF statement as follow:

MMULT(IF(TOROW(C)=UNIQUE(C),1,0),TRANSPOSE(IF(TOROW(INDEX(subset,,1))=arr,1,0))

The output of MMULT, on each row (unique bill numbers) has the occurrences of each arr value. Therefore if we do the sum by row (inner BYROW) and check against the number of counts we are looking for (cnt), we get the expected counts we are looking for.

The rest is just to invoke CALC for all cases and append by column via HSTACK.

  • Related