Here's the schema and data that i am working with
CREATE TABLE tbl (
name varchar(20) not null,
groups int NOT NULL
);
insert into tbl values('a', 35);
insert into tbl values('a', 36);
insert into tbl values('b', 35);
insert into tbl values('c', 36);
insert into tbl values('d', 37);
| name | groups|
|------|-------|
| a | 35 |
| a | 36 |
| b | 35 |
| c | 36 |
| d | 37 |
now i need names of only those that are having group greater than or equal to 35 but also an additional is that i can only include a row for which group=35 when a corresponding groups=36 is also present
| name | groups|
|------|-------|
| a | 35 |
| a | 36 |
second condition is that it CAN include those names that are having groups greater than or equal to 36 without having a groups=35
| name | groups|
|------|-------|
| c | 36 |
| d | 37 |
the only case it should leave out is where a record has only groups=35 present without a corresponding groups=36
| name | groups|
|------|-------|
| b | 35 |
i have tried the following
select name from tbl
where groups>=35
group by name
having count(distinct(groups))>=2
or groups>=36;
this is the error i am facing Column 'tbl.groups' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.
CodePudding user response:
So, as far as i can tell you just want to limit where groups 35 is by itself. I thought, lets try and isolate those names where they only have groups=35 and then not exists from there. Is this the correct output youre after?
Also, using complicated OR's in the where clause will often lead to your query not being SARGable. Better to UNION or some how building the query so that each part can use indexes (if they can).
if object_id('tempdb..#tbl') is not null drop table #tbl;
CREATE TABLE #tbl (
name varchar(20) not null,
groups int NOT NULL
);
insert into #tbl values('a', 35), ('a', 36), ('b', 35), ('c', 36), ('d', 37);
select *
from #tbl tbl
WHERE NOT EXISTS
(
SELECT COUNT(groups), name
FROM #tbl t
WHERE EXISTS
(
SELECT name
FROM #tbl tb
WHERE groups = 35
and tb.name=t.name
)
AND t.name = tbl.name
GROUP BY name
HAVING COUNT(groups)=1
)
;
CodePudding user response:
It looks like you need an exists() condition. Try:
select *
from tbl t
where t.groups >= 35
and (
t.groups > 35
or exists(select * from tbl t2 where t2.name = t.name and t2.groups = 36)
)
There are other ways to arrange the where clause to achieve the same effect. Having the t.groups >= 35 condition up front should give the query optimizer the ability to leverage an index on groups.
CodePudding user response:
You can use a windowed count for this
This avoids joining the table multiple times
SELECT
name,
groups
FROM (
SELECT *,
Count36 = COUNT(CASE WHEN groups = 36 THEN 1 END) OVER (PARTITION BY name)
FROM tbl
WHERE groups >= 35
) tbl
WHERE groups >= 36 OR Count36 > 0;
CodePudding user response:
Try this:
DECLARE @tbl table ( [name] varchar(20) not null, groups int NOT NULL );
INSERT INTO @tbl VALUES
('a', 35), ('a', 36), ('b', 35), ('c', 36), ('d', 37);
DECLARE @group int = 35;
; WITH cte AS (
SELECT
[name]
, COUNT ( DISTINCT groups ) AS distinct_group_count
FROM @tbl
WHERE
groups >= @group
GROUP BY
[name]
)
SELECT t.* FROM @tbl AS t
INNER JOIN cte
ON t.[name] = cte.[name]
WHERE
cte.distinct_group_count > 1
OR t.groups > @group;
RETURNS
------ --------
| name | groups |
------ --------
| a | 35 |
| a | 36 |
| c | 36 |
| d | 37 |
------ --------
Basically, this restricts the name results to groups with a value >= 35 with more than one distinct group associated, or any name with a group value greater than 35. Several assumptions were made in regard to your data, but I believe the logic still applies.