Number of records with different column values across two different groups-CodePudding

I have the following table in postgresql lets call it table1

entity id	entity group	value
1	A	5000
2	A	4000
3	A	3000
4	A	2000
5	A	1000
1	B	5000
2	B	4800
3	B	2700

I need to find the number of entities that have different values across different entity groups.
In the above example the answer will be 2 (entity_id: 2,3).

I have a hacky way of doing this as follows

with entity_diff as (
    with entitya as (
        select entity_id,
               value as value_group_a
        from table1
        where entity_group = 'A'
    ),
         entityb as (
             select entity_id,
                    value as value_group_b
             from table1
             where entity_group = 'B'
         )
    select entitya.entity_id,
           entitya.value_group_a - entityb.value_group_b as value_diff
    from entitya
             inner join entityb on entitya.entity_id = entityb.entity_id
)
select count(*) from from entity_diff
where abs(entity_diff.value_diff) > 0;

Is there a simpler way of getting to this answer that can also scale say when I need to compare 3 or 4 groups.

CodePudding user response：

You may try the following approaches which uses:

a group by with a having clause
self-join on similar entity_ids but different entitygroups that have different values.

Query #1

select 
    count(1)
from (
    select
        entityid
    from
        table1
    group by
        entityid
    having
        count(distinct entitygroup) > 1 and
        min(value) <> max(value)
) t1;

count
2

Query #2

select
    entityid
from
    table1
group by
    entityid
having
    count(distinct entitygroup) > 1 and
    min(value) <> max(value);

entityid
2
3

Query #3

select
    count(distinct t1.entityid)
from
    table1 t1
inner join
    table1 t2 on t1.entityid = t2.entityid and
                 t1.entitygroup < t2.entitygroup and
                 t1.value <> t2.value;

count
2

Query #4

select
    count(distinct t1.entityid)
from
    table1 t1
inner join
    table1 t2 on t1.entityid = t2.entityid and
                 t1.entitygroup < t2.entitygroup and
                 abs(t1.value - t2.value)>0;

count
2

Query #5

select distinct
    t1.entityid
from
    table1 t1
inner join
    table1 t2 on t1.entityid = t2.entityid and
                 t1.entitygroup < t2.entitygroup and
                 abs(t1.value - t2.value) > 0;

entityid
2
3

View working demo on DB Fiddle

CodePudding user response：

You can use count(distinct) with a subquery:

select count(*) from (select t1.id, count(distinct t1.value) h from table1 t1 group by t1.id) t2 
where t2.h = (select count(*) from table1 t3 where t3.id = t2.id) and t2.h > 1;

Output:

count
-----
2

And the corresponding entity IDs are (select t2.id from ...):

id
-----
2
3