Filtering out rows with NULL values only when a value already has a row with a NON NULL value-CodePudding

I am using SQL Server Management Studio 17.

I have a select statement with a group by that returns the following values. This is just a subset of 170k rows.

SELECT        child, parent
FROM            (SELECT child, parent
                          FROM table
                          GROUP BY child, parent) AS derivedtbl_1
ORDER BY child

Child	Parent
201	NULL
201	991
201	1020
202	NULL
203	NULL

I am struggling to find a select statement that filters out the first row. If a child already has a parent that is NOT NULL then I want it to filter out the row with the NULL value.

I have tried to solve it with a case when having count statement. For example if a value exists more than once in the child column then I want it to filter out the row where parent is NULL but all of my code so far returns errors.

Child	Parent
201	991
201	1020
202	NULL
203	NULL

CodePudding user response：

You may use exists logic here:

SELECT child, parent
FROM yourTable t1
WHERE
    Parent IS NOT NULL OR
    (Parent IS NULL AND
     NOT EXISTS (SELECT 1 FROM yourTable t2
                 WHERE t2.Child = t1.Child AND
                       t2.Parent IS NOT NULL));

Demo

CodePudding user response：

You can use a window function for this. It may be faster or slower than using an EXISTS self-join, you need to test

SELECT
  child,
  parent
FROM (
    SELECT
      child,
      parent,
      AnyParent = MAX(parent) OVER (PARTITION BY child)
    FROM [table]
    GROUP BY child, parent
) AS derivedtbl_1
WHERE AnyParent IS NOT NULL;
ORDER BY child