Home > Enterprise >  How to ignore rows of common value, if there's a NULL in another column
How to ignore rows of common value, if there's a NULL in another column

Time:08-17

I'm struggling with writing a statement that would query for both:

  • all rows of one common id, which never have NULLs in two other columns
  • just those rows of one common id, which if there is a NULL in either of two other columns, query for records with a date greater or equal than the last record having a NULL in either of two columns

An example with sample data (those 'two columns' here are old_param and new_param);
empty strings are considered equivalent to NULL for the software using this DB:

example result set to work on

  • Colored in green are rows that the query should return.
  • all rows of objid 5 and 4 should be included, since both old_param and new_param never have NULLs occurring.
  • from rows of objid 6 - row #6 since it's the latest occurrence of either of new_param or old_param being NULL, so the older ones should be ignored, but the consecutive ones occurring after this row should be included.

I was thinking of splitting those two conditions into two separate queries joined by a UNION to make the task easier, but I am unsure of how to write a query that would separate those conditions by the id column.

I did, however, manage to write a query for the second of my requirements:

SELECT {various columns}
  FROM {table name   joins}
  WHERE [date] >= (
                SELECT MAX([date]) 
                FROM {table name}
                WHERE [new_param] IS NOT NULL 
                    AND ([old_param] IS NULL 
                         OR DATALENGTH([old_param]) = 0))

But I can't figure out how to combine it into one, bigger query.

CodePudding user response:

--Data to work with
DECLARE @data TABLE (date date, new_param nvarchar(100), objid int, old_param nvarchar(100))
INSERT INTO @data (date, new_param, objid, old_param) VALUES ('2022-08-10', 'hello', 5, 'world')
INSERT INTO @data (date, new_param, objid, old_param) VALUES ('2022-08-10', 'hello', 4, 'world')
INSERT INTO @data (date, new_param, objid, old_param) VALUES ('2022-08-11', 'hello', 6, '')
INSERT INTO @data (date, new_param, objid, old_param) VALUES ('2022-08-12', 'hello', 6, 'world')
INSERT INTO @data (date, new_param, objid, old_param) VALUES ('2022-08-13', '', 6, 'world')
INSERT INTO @data (date, new_param, objid, old_param) VALUES ('2022-08-14', 'hello', 6, '')
INSERT INTO @data (date, new_param, objid, old_param) VALUES ('2022-08-15', 'hello', 6, 'world')
INSERT INTO @data (date, new_param, objid, old_param) VALUES ('2022-08-16', 'hello', 6, 'world')


--The query you are looking for
SELECT date, new_param, objid, old_param
FROM @data d
WHERE date >= ISNULL((SELECT MAX(date) 
                      FROM @data  
                      WHERE objid = d.objid 
                          and (ISNULL(new_param, '') = '' or ISNULL(old_param, '') = '')), date) 

The subquery gets the maximum date where one of the params is null for each objid, if there are no such entries the subquery will return null which is then checked by the ISNULL condition and takes the right date. This is the result I'm getting from what I understood from the question.

enter image description here

  • Related