Home > Back-end >  How to deal with SQL IN Operator returning multiple values when only one is needed?
How to deal with SQL IN Operator returning multiple values when only one is needed?

Time:11-24

I am working on a multiple parent child hierarchy that looks something like this. This is a subset of a table with 105k records.

NodeKey ParentKey Child Parent
1 A
2 B
3 C A
4 D B
5 D C

I need to fill the column ParentKey with data with the following condition.

If the value in Parent is NULL, set the value in ParentKey NULL as well.

If the value in Parent is NOT NULL and is also in Child, then select the corresponding NodeKey and set it as the value in ParentKey (see 2nd table).

I can do that but there is a problem when the value from the Parent column appears more than once in the Child column.

In the 5th row it doesn't matter which value is chosen between 3 or 4. It can be either one.

NodeKey ParentKey Child Parent
1 A
2 B
3 1 C A
4 2 C B
5 Doesn't matter if 3 or 4 D C
SELECT (CASE WHEN Parent IS NULL THEN NULL 

ELSE 

(SELECT NodeKey from table WHERE Parent IN (SELECT Child from table)) END) as ParentKey

When executing this code it tells me that "Subquery returned more that 1 value" which makes sense. But regardless where I put a max() or min() it doesn't work.

When I put max() in front of NodeKey it just returns a column full with NULL and 105314. 105314 is the amount of rows in the table.

I am using SQL Server Management Studio 17.

CodePudding user response:

If it is not matter what ParentKey will be used, you can use MIN (MAX) function:

SELECT 
    TBL.NodeKey,
    PK AS ParentKey,
    TBL.Child, 
    TBL.Parent
FROM TBL
LEFT JOIN (
    SELECT Child, MIN(NodeKey) PK FROM TBL GROUP BY Child
) P ON P.Child = TBL.Parent;

Test MS SQL query

or another version:

SELECT 
    TBL.NodeKey,
    MIN(P.NodeKey) AS ParentKey,
    TBL.Child, 
    TBL.Parent
FROM TBL
LEFT JOIN TBL P ON P.Child = TBL.Parent
GROUP BY TBL.NodeKey, TBL.Child, TBL.Parent;

Result:

 ========= =========== ======= ======== 
| NodeKey | ParentKey | Child | Parent |
 ========= =========== ======= ======== 
| 1       | (null)    | A     | (null) |
 --------- ----------- ------- -------- 
| 2       | (null)    | B     | (null) |
 --------- ----------- ------- -------- 
| 3       | 1         | C     | A      |
 --------- ----------- ------- -------- 
| 4       | 2         | C     | B      |
 --------- ----------- ------- -------- 
| 5       | 3         | D     | C      |
 --------- ----------- ------- -------- 

CodePudding user response:

TOP clause is what you are looking for, SELECT TOP 1 in your case:

WHERE Parent IN (SELECT TOP 1 Child from table)

  • Related