Using postgres 11 I have a data table int_state
that looks like this:
dev | int |
---|---|
sw01 | gi1 |
sw01 | gi2 |
sw01 | gi3 |
sw01 | gi4 |
sw02 | gi3 |
sw02 | gi4 |
sw02 | gi5 |
sw03 | gi3 |
sw03 | gi4 |
sw03 | gi6 |
sw03 | gi7 |
I need a query that will return a single row for every instance of dev
where int
is the same across every instance of dev
. Using the sample above query should only return "gi3" as the value for int
since that value is shared between all instances of dev
and result is limited is a single row for every dev
. Desired results from sample data above:
dev | int |
---|---|
sw01 | gi3 |
sw02 | gi3 |
sw03 | gi3 |
The number of dev
instances (x) to match against will vary, but results should always contain x rows, with a row for every dev
asked for in the query. If there are multiple matches against int
the query should only return the first one found.
Using a JOIN
doesn't seem like it will work for me, since I need x rows of results, not a single row with all results combined. I have been playing around with CTEs and UNION but so far unable to get it working. I thought nesting CTEs would help, but apparently that's not supported in postgres.
I plan on constructing this query on-demand in code ahead of time, so if I have to construct something that contains where dev in ('sw01','sw02','x','y','z')
that's okay.
CodePudding user response:
WITH distinctdevs AS (SELECT DISTINCT dev FROM int_state)
SELECT distinctdevs.dev, (SELECT int FROM int_state ints WHERE NOT EXISTS
(SELECT dev FROM distinctdevs WHERE NOT EXISTS (
SELECT * FROM int_state WHERE
int_state.int = ints.int AND int_state.dev = distinctdevs.dev
)
) LIMIT 1) FROM distinctdevs;
You might actually get better performance if you don't reuse the CTE on line 3, as that might trick the optimizer into thinking this is a correlated subquery rather than a constant. The core trick here is the doubly nested WHERE NOT EXISTS
, converting "an int that appears for every dev" to the logically equivalent "an int for which there is no dev for which it does not appear."
CodePudding user response:
ok, so if I understand your question, you want to find all the instances of dev (01, 02, 03, etc), and then check that the value for int is the same for all the instances?
You can do a clever workaround I think....
---- get the count of how many instances of dev there are...
WITH distinct_dev AS (
SELECT
dev
, count(*) AS dev_count
FROM
table
GROUP BY
1
)
----- get the count of how many instances of int there are per dev/int
, dev_int AS (
SELECT
t.dev
, t.int
, COUNT(DISTINCT t2.int) as count_dev_int
FROM
original_table t
LEFT JOIN
original_table t2
ON t.dev = t2.dev
AND t.int = t2.int
GROUP BY
1, 2
)
----- then from the first CTE select unique devs, and join the 2nd CTE where the dev envs = the count of int per dev env....
SELECT
dd.dev
, di.int
FROM
distinct_dev dd
JOIN
dev_int di
ON di.dev = dd.dev
AND di.count_dev_int = dd.count_dev
CodePudding user response:
You can build CTE with an array of distinct dev
values and another with array of dev
values for each int
using array_agg. Then JOIN then on matching arrays. (see demo)
with dev_array as
( select array_agg(distinct dev) dev
from data_tab
)
, int_match as
(select int, array_agg(dev) dev
from data_tab
group by int
)
select int, unnest(dev) dev
from (select im.*
from dev_array da
join int_match im
on (im.dev = da.dev)
order by im.int
limit 1
) sq;