I have data that looks like this:
Column1 | Column2 |
---|---|
value1 | x |
value1 | x |
value1 | (null) |
value2 | y |
value2 | y |
value3 | (null) |
value3 | (null) |
I want to Count Distinct values in Column1 and Group by Column 2 where it ignore nulls unless null is the only value. expected result would be:
Column2 | Count Distinct Column1 |
---|---|
x | 1 |
y | 1 |
(null) | 1 |
any help is appreciated. thanks
CodePudding user response:
I want to Count Distinct values in Column1 and Group by Column 2 where it ignore nulls unless null is the only value
I assume you mean to "ignore rows where column2
is null, unless the column1
value does not appear in other rows" - like the "value3" that appears only in rows where column2
is null?
You can do such filtering like this:
SELECT
count(DISTINCT column1) FILTER (
WHERE column2 IS NOT NULL OR NOT EXISTS(
SELECT * FROM example i WHERE i.column1 = o.column1 AND i.column2 IS NOT NULL
)
),
column2
FROM example o
GROUP BY column2;
CodePudding user response:
OP here.
I was able to answer my question. I ended up using a union between 2 queries to get what I was looking for. the first query counts the non-null groups and the second counts the null group but with a CTE to exclude values already counted in the first query.
code I used below:
select count (distinct column1), column2
from table
where column2 is not null
group by column2
union
select count (distinct column1), column2
from table
where column1 not in (
select distinct column1 from table
where column2 is not null)
and column2 is null
group by column2
Thanks for the input!