Home > Net >  Get count of combinations of all available cases in a column in SQL
Get count of combinations of all available cases in a column in SQL

Time:12-15

I have a table like below. Members can have purchases under mutliple categories. I want to get all category combinations for members who have had purchases in more than one categeory. That is, I want to get all combinations and their counts. For example, members A,B,C,F have had purhcase in three categories; member D only in one, and member E is in two categories.

 category  member  quantity
 Clothes     A          1 
 Clothes     B          2 
 Clothes     C          3 
 Clothes     D          1 
 Cards       A          1 
 Cards       B          1 
 Cards       C          2 
 Cards       E          3 
 Cards       F          3 
 Trips       A          1 
 Trips       B          2 
 Trips       C          2 
 Trips       F          1 
 Dining      E          2
 Dining      F          1 

The output I want is like this:

       Categories             mem    quant_total   
Clothes    Cards    Trips      3         15             ==> members A B C
Cards      Trips    Dining     1         5              ==> Only member F

Cards      Dining              1         5              ==> Only member E

I can get count of categories for members who appeared in multiple categories, but don't know how to get the above results.

SELECT category, count(category)
FROM table
WHERE member IN  (SELECT member FROM table
                    GROUP BY member 
                    HAVING count(distinct category) > 1 ) 
GROUP BY category

Plz note that category combinations can be larger like 4,5,6..., and this is a toy example.

CodePudding user response:

We'll start with the data.

Please provide the DDL for this next time. It is friendlier to those providing you the free labor.

with "table" (category, member, quantity) as (
        select 'Clothes', 'A', 1
  union select 'Clothes', 'B', 2
  union select 'Clothes', 'C', 3
  union select 'Clothes', 'D', 1
  union select 'Cards', 'A', 1
  union select 'Cards', 'B', 1
  union select 'Cards', 'C', 2
  union select 'Cards', 'E', 3
  union select 'Cards', 'F', 3
  union select 'Trips', 'A', 1
  union select 'Trips', 'B', 2
  union select 'Trips', 'C', 2
  union select 'Trips', 'F', 1
  union select 'Dining', 'E', 2
  union select 'Dining', 'F', 1
),

Then we aggregate the cagetories and get the counts

For PostgreSQL:

counts as (
  select member as mem
  , STRING_AGG ( category, ', ' order by category ) as categories
  , count(distinct category) as category_count
  , sum(quantity) as quant_total
  from "table"
  group by member
)

or for Redshift:

counts as (
  select member as mem
  , LISTAGG( category , ',' ) WITHIN GROUP (ORDER BY category) as categories
  , count(distinct category) as category_count
  , sum(quantity) as quant_total
  from "table"
  group by member
)

Then we get the results:

select categories
, count(distinct mem) as mem
, sum(quant_total) as quant_total
from counts
where category_count > 1
group by categories
order by 1

I couldn't test the redshift code, but I tested this on an Oracle fiddle and it works.

with "table" (category, member, quantity) as (
        select 'Clothes', 'A', 1 from dual
  union select 'Clothes', 'B', 2 from dual
  union select 'Clothes', 'C', 3 from dual
  union select 'Clothes', 'D', 1 from dual
  union select 'Cards', 'A', 1 from dual
  union select 'Cards', 'B', 1 from dual
  union select 'Cards', 'C', 2 from dual
  union select 'Cards', 'E', 3 from dual
  union select 'Cards', 'F', 3 from dual
  union select 'Trips', 'A', 1 from dual
  union select 'Trips', 'B', 2 from dual
  union select 'Trips', 'C', 2 from dual
  union select 'Trips', 'F', 1 from dual
  union select 'Dining', 'E', 2 from dual
  union select 'Dining', 'F', 1 from dual
),
counts as (
  select member as mem
  , listagg ( category, ', ') WITHIN GROUP (ORDER BY category ) as categories
  , count(distinct category) as category_count
  , sum(quantity) as quant_total
  from "table"
  group by member
)
select categories
, count(distinct mem) as mem
, sum(quant_total) as quant_total
from counts
where category_count > 1
group by categories
order by 1

CodePudding user response:

OK Here's my solution:

SELECT Categories, count(member), sum(memcat_total) 
FROM (
        SELECT  group_concat(category ORDER BY category SEPARATOR " " ) AS Categories, 
                member,
                sum(quantity) as memcat_total
        FROM table
        GROUP BY member
        ORDER by categories
    ) AS SQL1
GROUP BY Categories;

Which gives the following:

enter image description here

EDIT SQL Modified to Postgres syntax (untested)

SELECT Categories, count(member), sum(memcat_total) 
FROM (
        SELECT  ARRAY_AGG(category || ' ' ORDER BY category) AS Categories, 
                member,
                sum(quantity) as memcat_total
        FROM table
        GROUP BY member
        ORDER by categories
    ) AS SQL1
GROUP BY Categories;

EDIT 2: My attempt at a Redshift version:

SELECT Categories, count(member), sum(memcat_total) 
FROM (
        SELECT  LISTAGG(category,' ') within group (order by  category) AS Categories, 
                member,
                sum(quantity) as memcat_total
        FROM table
        GROUP BY member
        ORDER by categories
    ) AS SQL1
GROUP BY Categories;
  • Related