In SQL Server, I have the following table:
CREATE TABLE x
(
ColumnA CHAR NOT NULL
, ColumnB CHAR NOT NULL
, ColumnC INT NOT NULL
, CONSTRAINT CompositeKey PRIMARY KEY (ColumnA, ColumnB)
);
With the following data:
INSERT INTO x
VALUES
('A', 'X', 2)
,('A', 'Y', 1)
,('B', 'X', 4)
,('B', 'Z', 2)
,('C', 'X', 9)
,('C', 'Y', 3)
,('C', 'P', 2)
,('D', 'X', 6)
,('D', 'Y', 4)
,('E', 'P', 2)
I am writing a query with the following 2 predicates:
Count(ColumnA) > 1
Count(ColumnB) > 2
MY QUERY
SELECT *
FROM
(
SELECT *
,COUNT(*) OVER (PARTITION BY [ColumnA]) AS columnA_cnt
,COUNT(*) OVER (PARTITION BY [ColumnB]) AS columnB_cnt
FROM @DataSource
) DS
WHERE columnA_cnt >= 2
AND columnB_cnt >= 3
EXPECTED RESULT
A | B | C |
---|---|---|
A | X | 2 |
A | Y | 1 |
C | X | 9 |
C | Y | 3 |
D | X | 6 |
D | Y | 4 |
ACTUAL RESULT
A | B | C |
---|---|---|
A | X | 2 |
A | Y | 1 |
B | X | 4 |
C | X | 9 |
C | Y | 3 |
D | X | 6 |
D | Y | 4 |
CodePudding user response:
This query meet the requirement with your expected result
CREATE TABLE x
(
ColumnA CHAR NOT NULL
, ColumnB CHAR NOT NULL
, ColumnC INT NOT NULL
, CONSTRAINT CompositeKey PRIMARY KEY (ColumnA, ColumnB)
);
INSERT INTO x
VALUES
('A', 'X', 2)
,('A', 'Y', 1)
,('B', 'X', 4)
,('B', 'Z', 2)
,('C', 'X', 9)
,('C', 'Y', 3)
,('C', 'P', 2)
,('D', 'X', 6)
,('D', 'Y', 4)
,('E', 'P', 2)
with cte as (
select * from x
where ColumnA in (select ColumnA from x group by ColumnA having count(ColumnA) >1)
and ColumnB in (select ColumnB from x group by ColumnB having count(ColumnA) >2)
)
select * from cte
where ColumnA in (select ColumnA from cte group by ColumnA having count(ColumnA) >1)
and ColumnB in (select ColumnB from cte group by ColumnB having count(ColumnA) >2)
CodePudding user response:
Will this work for you (with a second condition columnA > 1
, columnB > 2
) ?:
select
ColumnA
,ColumnB
,ColumnC
from (
select
ColumnA
,ColumnB
,ColumnC
,COUNT(ColumnA) OVER (PARTITION BY ColumnA) AS columnA_cnt
,COUNT(ColumnB) OVER (PARTITION BY ColumnB) AS columnB_cnt
from (
SELECT *
,COUNT(ColumnA) OVER (PARTITION BY ColumnA) AS columnA_cnt
,COUNT(ColumnB) OVER (PARTITION BY ColumnB) AS columnB_cnt
FROM x
) ds where columnA_cnt >= 2 and columnB_cnt >= 3
) ds2 where columnA_cnt > 1 and columnB_cnt > 2
order by columnA, columnB;
CodePudding user response:
(ColumnA, ColumnB)
is the primary key. Hence
COUNT([ColumnA]) OVER (PARTITION BY [ColumnB]) AS columnA_cnt
is the same as
COUNT(*) OVER (PARTITION BY [ColumnB]) AS columnA_cnt
You look at a row's ColumnB
and count how many times it occurs in the table. So a better name than columnA_cnt
would probably be columnB_cnt
;-)
SELECT *
FROM
(
SELECT *
,COUNT(*) OVER (PARTITION BY [ColumnA]) AS columnA_cnt
,COUNT(*) OVER (PARTITION BY [ColumnB]) AS columnB_cnt
FROM @DataSource
) DS
WHERE columnA_cnt >= 2
AND columnB_cnt >= 3