I have 2 tables i am joining, and have created a unique field, a concatenation of 2 string fields. When trying to get a count of that unique field using count(unique_field), I get an incorrect count. My unique_field does include nulls in some of the results, which are being counted correctly. When using count(distinct unique_field) I get 1's and 0's, which is also incorrect. What is a way I can figure this count out??
SELECT DISTINCT
CONCAT(product_id,'_', channel_id ) as unique_id,
count(DISTINCT unique_id) as unique_id_count,
field c,
field d
From tablea full outer join tableb
ON tablea.product_id=tableb.product_id
group by 3,4
Example output: column A is unique_Id, Column B is actual count of unique_Id, and column c is the results of unique_id_count
CodePudding user response:
As per my understanding you want to count how many occurrences exist for an individual unique_id.
Therefore, I would suggest to group by unique_id and not by field c and field d.
SELECT DISTINCT
CONCAT(product_id,'_', channel_id ) as unique_id,
count(DISTINCT unique_id) as unique_id_count
From tablea full outer join tableb
ON tablea.product_id=tableb.product_id
group by CONCAT(product_id,'_', channel_id )
CodePudding user response:
so if we "try" make your SQL valid:
with tablea(product_id) as (
select * from values
('a'),
('b'),
('c'),
('d'),
('e')
), tableb(product_id, channel_id) as (
select * from values
('a', '1'),
('a', '2'),
('b', '3')
)
SELECT
CONCAT(NVL(a.product_id,b.product_id),'_', b.channel_id ) as unique_id
,count(DISTINCT unique_id) as unique_id_count
--field c,
--field d
From tablea as a
full outer join tableb as b
ON a.product_id = b.product_id
group by 1
gives:
UNIQUE_ID | UNIQUE_ID_COUNT |
---|---|
a_1 | 1 |
a_2 | 1 |
b_3 | 1 |
null | 0 |
Which makes sense as in this data there is only "one" thing per grouping. Now I get you had some fields c & d
so maybe like:
with tablea(product_id, field) as (
select * from values
('a', 1),
('b', 1),
('c', 2),
('d', 2),
('e', 2)
), tableb(product_id, channel_id, field) as (
select * from values
('a', '1', 10),
('a', '2', 10),
('a', '2', 10),
('b', '3', 10)
)
SELECT
CONCAT(NVL(a.product_id,b.product_id),'_', b.channel_id ) as unique_id
,count(DISTINCT unique_id) as unique_id_count
,a.field as c
,b.field as d
From tablea as a
full outer join tableb as b
ON a.product_id = b.product_id
group by c, d
but this is not valid as unique_id
is confusing the SQL, so we can inline that:
SELECT
count(DISTINCT CONCAT(NVL(a.product_id,b.product_id),'_', b.channel_id )) as unique_id_count
,a.field as c
,b.field as d
From tablea as a
full outer join tableb as b
ON a.product_id = b.product_id
group by c, d
gives:
UNIQUE_ID_COUNT | C | D |
---|---|---|
3 | 1 | 10 |
0 | 2 | null |
But what you are trying to do is somewhat confusing.