Home > Enterprise >  Don't select rows where column A is duplicated AND any row of column B is a specific value
Don't select rows where column A is duplicated AND any row of column B is a specific value

Time:07-07

I'm working on generating a report merging multiple tables. The report requires only showing projects that did not have any document marked 'Not Received' These document markings are listed in a table that lists each document in an individual line. So when merged into my other table it creates multiple rows of the same project. For example the following table

Project Number ChecklistValue
565 Received
565 Not Received
465 Received
465 Not Applicable

As you can see really only two projects are listed on this table but the desired output is:

Project Number Other Info
465 etc

I do not need the checklist value on the actual report, so I can use the GROUP BY to combine all the good rows, but where I have an Issue is that would still include project 565 even if I include something like where ChecklistValue <> 'Not Received', 565 needs to be hidden from the report entirely because any row for 565 contains 'Not Received'.

So that's my actual question, how do I exclude all project numbers rows that have any row containing 'Not Received'?

I'm adding the entire query will generalized names below:

SELECT 
Project Number
,Name
,Contractor
,ABS(DATEDIFF(day,(ActualDate),(EstDate))) AS DelayPeriod
,S.NoteDate
,S.FinalAppDate
,Status
,S.ONE
,S.TWO
,S.THREE
,S.FOUR
,CH.ChecklistValue
FROM [DB1] A
INNER JOIN [DB2] C ON A.Contractor = C.Contractor
INNER JOIN [DB3] S ON A.AppID = S.AppID
INNER JOIN [DB4] LS ON S.StatusID = LS.StatusID
LEFT OUTER JOIN [DB5] CH ON A.AppID = CH.AppID AND CH.OtherID = 1
WHERE C.TypeID = 4 AND A.YEAR = 2022, AND S.THING = 1 AND
(CH.CheckListValue IS NULL OR A.AppID NOT IN (SELECT * FROM [DB5] WHERE
CheckListValue = 'Not Reveived'))
GROUP BY Project Number,Name,Contractor,ABS(DATEDIFF(day,(ActualDate),(EstDate))) AS DelayPeriod,S.NoteDate,S.FinalAppDate,Status,S.ONE,S.TWO,S.THREE,S.FOUR

The last portion of the WHERE clause was added from a suggestion, but I'm clearly not implementing it correctly as it errors

CodePudding user response:

You can use not in like:

create table test(
  num int,
  description varchar(20)
);
insert into test(num,description)
values(565,'Received'),
(565,'Not Received'),
(465,'Received'),
(465,'Not Applicable');

select *
from test
where num not in
(
  select num -- Only select one column here
  from test
  where description = 'Not Received'
);

Results:

 ----- --------------- 
| num | description   |
 ----- --------------- 
| 465 | Received      |
| 465 | Not Applicable|
 ----- --------------- 

db<>fiddle this is on sql-server but works on other dbms as well.

So in your query you should have (in my understanding):

OR A.AppID NOT IN 
(
    SELECT AppID -- Not select *
    FROM [DB5] 
    WHERE CheckListValue = 'Not Reveived'
)

Other way to do it is with a cte but it is complicated at first glance:

with x as(
  select num
  from test
  where description = 'Not Received'
)
select t.num, t.description
from test t
left join x
on t.num = x.num
where x.num is null

I'm first creating a cte on the num column where the description = not received then I'm selecting all from the test table, and I'm left joining to the cte but I'm only selecting the num column that are not in the cte by using where x.num is null, and this will only return 465.

Now which one is better? I don't know sometimes join would be faster and sometimes in, for more you can find on this post.

  • Related