I am working on a multiple parent child hierarchy that looks something like this. This is a subset of a table with 105k records.
NodeKey | ParentKey | Child | Parent |
---|---|---|---|
1 | A | ||
2 | B | ||
3 | C | A | |
4 | D | B | |
5 | D | C |
I need to fill the column ParentKey with data with the following condition.
If the value in Parent is NULL
, set the value in ParentKey NULL
as well.
If the value in Parent is NOT NULL
and is also in Child, then select the corresponding NodeKey and set it as the value in ParentKey (see 2nd table).
I can do that but there is a problem when the value from the Parent column appears more than once in the Child column.
In the 5th row it doesn't matter which value is chosen between 3 or 4. It can be either one.
NodeKey | ParentKey | Child | Parent |
---|---|---|---|
1 | A | ||
2 | B | ||
3 | 1 | C | A |
4 | 2 | C | B |
5 | Doesn't matter if 3 or 4 | D | C |
SELECT (CASE WHEN Parent IS NULL THEN NULL
ELSE
(SELECT NodeKey from table WHERE Parent IN (SELECT Child from table)) END) as ParentKey
When executing this code it tells me that "Subquery returned more that 1 value" which makes sense. But regardless where I put a max()
or min()
it doesn't work.
When I put max()
in front of NodeKey it just returns a column full with NULL and 105314. 105314 is the amount of rows in the table.
I am using SQL Server Management Studio 17.
CodePudding user response:
If it is not matter what ParentKey will be used, you can use MIN (MAX) function:
SELECT
TBL.NodeKey,
PK AS ParentKey,
TBL.Child,
TBL.Parent
FROM TBL
LEFT JOIN (
SELECT Child, MIN(NodeKey) PK FROM TBL GROUP BY Child
) P ON P.Child = TBL.Parent;
or another version:
SELECT
TBL.NodeKey,
MIN(P.NodeKey) AS ParentKey,
TBL.Child,
TBL.Parent
FROM TBL
LEFT JOIN TBL P ON P.Child = TBL.Parent
GROUP BY TBL.NodeKey, TBL.Child, TBL.Parent;
Result:
========= =========== ======= ========
| NodeKey | ParentKey | Child | Parent |
========= =========== ======= ========
| 1 | (null) | A | (null) |
--------- ----------- ------- --------
| 2 | (null) | B | (null) |
--------- ----------- ------- --------
| 3 | 1 | C | A |
--------- ----------- ------- --------
| 4 | 2 | C | B |
--------- ----------- ------- --------
| 5 | 3 | D | C |
--------- ----------- ------- --------
CodePudding user response:
TOP
clause is what you are looking for, SELECT TOP 1 in your case:
WHERE Parent IN (SELECT TOP 1 Child from table)