I have a table that looks like this:
|FileID| File Info |
| ---- | ------------ |
| 1 | X |
| 1 | Y |
| 2 | Y |
| 2 | Z |
| 2 | A |
I want to aggregate by FileID and split the File Info column into 2 separate count columns. I want 1 column to have the count of the Unique File Info and the other to be a count of non-Unique file info.
The result would ideally look like this:
|FileID| Count(Unique)| Count(Non-unique) |
| ---- | ------------ | ----------------- |
| 1 | 1 | 1 |
| 2 | 2 | 1 |
where the non-unique count is the 'Y' and the unique count is from the 'X' and 'Z, A' for FileID 1 and 2 respectively.
I'm looking for ways to gauge uniqueness between files rather than within.
CodePudding user response:
First you select the "Non Unique" rows from the table
SELECT FileInfo
FROM sometableyoudidnotname
GROUP BY FileInfo
HAVING COUNT(*) > 1
Now that you know which ones are unique and non unique you can left join to that table to get the "status" and count it up.
SELECT base.FileID,
SUM(CASE WHEN u.FileID is NOT NULL THEN 1 ELSE 0 END) as nonunique,
SUM(CASE WHEN u.FileID is NULL THEN 1 ELSE 0 END) as unique
FROM sometableyoudidnotname base
LEFT JOIN (
SELECT FileInfo
FROM sometableyoudidnotname
GROUP BY FileInfo
HAVING COUNT(*) > 1
) u ON base.FileInfo = u.FileInfo
GROUP BY base.FileID
CodePudding user response:
Use COUNT()
window function in every row to check if FileInfo
is unique and then use conditional aggregation to get the results that you want:
SELECT FileID,
COUNT(CASE WHEN counter = 1 THEN 1 END) count_unique,
COUNT(CASE WHEN counter > 1 THEN 1 END) count_non_unique
FROM (
SELECT t.*, COUNT(*) OVER (PARTITION BY t.FileInfo) counter
FROM tablename t
) t
GROUP BY FileID;
See the demo.
CodePudding user response:
Have a derived table that counts occurrences of each fileid. JOIN
and GROUP BY
:
select t1.FileID,
sum(case when t2.ficount = 1 then 1 else 0 end),
sum(case when t2.ficount > 1 then 1 else 0 end)
from tablename t1
join
(
select fileinfo, count(*) ficount
from tablename
group by fileinfo
) t2
on t1.fileinfo = t2.fileinfo
group by t1.FileID