Given this data:
| x | y |
| -- | ---- |
| a | 1 |
| a | null |
| b | 2 |
| b | 3 |
| b | 3 |
| b | null |
| b | null |
| c | null |
| c | null |
I want to write a query that returns this result:
| x | y |
| -- | ---- |
| a | 1 |
| b | 2 |
| b | 3 |
| c | null |
That is to say, for a given x value, if there are rows with non-null y values, include all the distinct (x value, y value) rows and omit any (x value, null) rows. But, if for the x value, all the rows have null y, then return (x value, null).
Sample data
create table t (x text, y int);
insert into t
values ('a' , 1)
, ('a' , null)
, ('b' , 2)
, ('b' , 3)
, ('b' , 3)
, ('b' , null)
, ('b' , null)
, ('c' , null)
, ('c' , null);
CodePudding user response:
Multiple ways to do this. One way using ROW_NUMBER()
:
WITH
DistinctXYWithRowNumbers AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY x ORDER BY y DESC) AS RowNumber
FROM
(SELECT DISTINCT x, y FROM t) DistinctXY
)
SELECT
x,
y
FROM
DistinctXYWithRowNumbers
WHERE
NOT (y IS NULL AND RowNumber > 1)
ORDER BY
x,
y
;
Another way using a count table and JOIN
:
WITH
Counts AS (
SELECT
x,
COUNT(DISTINCT y) AS count
FROM
t
GROUP BY
x
)
SELECT
DistinctXY.*
FROM
(SELECT DISTINCT x, y FROM t) DistinctXY
LEFT JOIN Counts ON DistinctXY.x = Counts.x
WHERE
NOT (y IS NULL AND count > 0)
;
CodePudding user response:
You may use count window function as the following:
Select Distinct x, y
From
(
Select x, y,
COUNT(Case When y Is Not null Then 1 End) Over (Partition By x) cn
From t
) D
Where y Is Not null Or cn = 0
Order By x, y
Or you may use Not Exists
with a correlated subquery as the following:
Select Distinct x, y From t As A
Where y Is Not null
Or (y Is null And Not Exists(Select 1 From t As B Where B.x = A.x And B.y Is Not null))
Order By x, y
See a demo.