SQL Server query for multiple conditions on the same column-CodePudding

Here's the schema and data that i am working with

CREATE TABLE tbl (
    name varchar(20) not null,
    groups int NOT NULL
);

insert into tbl values('a', 35);
insert into tbl values('a', 36);
insert into tbl values('b', 35);
insert into tbl values('c', 36);
insert into tbl values('d', 37);

| name | groups|
|------|-------|
| a    | 35    |
| a    | 36    |
| b    | 35    |
| c    | 36    |
| d    | 37    |

now i need names of only those that are having group greater than or equal to 35 but also an additional is that i can only include a row for which group=35 when a corresponding groups=36 is also present

| name | groups|
|------|-------|
| a    | 35    |
| a    | 36    |

second condition is that it CAN include those names that are having groups greater than or equal to 36 without having a groups=35

| name | groups|
|------|-------|
| c    | 36    |
| d    | 37    |

the only case it should leave out is where a record has only groups=35 present without a corresponding groups=36

| name | groups|
|------|-------|
| b    | 35    |

i have tried the following

select name from tbl
where groups>=35
group by name
having count(distinct(groups))>=2
or groups>=36;

this is the error i am facing Column 'tbl.groups' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.

CodePudding user response：

So, as far as i can tell you just want to limit where groups 35 is by itself. I thought, lets try and isolate those names where they only have groups=35 and then not exists from there. Is this the correct output youre after?

Also, using complicated OR's in the where clause will often lead to your query not being SARGable. Better to UNION or some how building the query so that each part can use indexes (if they can).

if object_id('tempdb..#tbl') is not null drop table #tbl;
CREATE TABLE #tbl (
    name varchar(20) not null,
    groups int NOT NULL
);

insert into #tbl values('a', 35), ('a', 36), ('b', 35), ('c', 36), ('d', 37);

select * 
from #tbl tbl
WHERE NOT EXISTS
    (
        SELECT COUNT(groups), name 
        FROM #tbl t
        WHERE EXISTS
            (
                SELECT name 
                FROM #tbl tb
                WHERE groups = 35
                and tb.name=t.name
            )
        AND t.name = tbl.name
        GROUP BY name 
        HAVING COUNT(groups)=1
    )
;

CodePudding user response：

It looks like you need an exists() condition. Try:

select *
from tbl t
where t.groups >= 35
and (
    t.groups > 35
    or exists(select * from tbl t2 where t2.name = t.name and t2.groups = 36)
)

There are other ways to arrange the where clause to achieve the same effect. Having the t.groups >= 35 condition up front should give the query optimizer the ability to leverage an index on groups.

CodePudding user response：

You can use a windowed count for this

This avoids joining the table multiple times

SELECT
  name,
  groups
FROM (
    SELECT *,
      Count36 = COUNT(CASE WHEN groups = 36 THEN 1 END) OVER (PARTITION BY name)
    FROM tbl
    WHERE groups >= 35
) tbl
WHERE groups >= 36 OR Count36 > 0;

db<>fiddle

CodePudding user response：

Try this:

DECLARE @tbl table ( [name] varchar(20) not null, groups int NOT NULL );

INSERT INTO @tbl VALUES
    ('a', 35), ('a', 36), ('b', 35), ('c', 36), ('d', 37);

DECLARE @group int = 35;

; WITH cte AS (
    SELECT
        [name]
        , COUNT ( DISTINCT groups ) AS distinct_group_count
    FROM @tbl
    WHERE
        groups >= @group
    GROUP BY
        [name]
)
SELECT t.* FROM @tbl AS t
INNER JOIN cte
    ON t.[name] = cte.[name]
WHERE
    cte.distinct_group_count > 1
    OR t.groups > @group;

RETURNS

 ------ -------- 
| name | groups |
 ------ -------- 
| a    |     35 |
| a    |     36 |
| c    |     36 |
| d    |     37 |
 ------ --------

Basically, this restricts the name results to groups with a value >= 35 with more than one distinct group associated, or any name with a group value greater than 35. Several assumptions were made in regard to your data, but I believe the logic still applies.