Home > Software engineering >  Find which where clause is the most troublesome
Find which where clause is the most troublesome

Time:04-19

Problem background

I am trying to pin down to what condition(s) are causing no records / rows the most, so to allow me to find the root cause of what data in the database might need scrubbing.

So for example from the following query I would like to know whether it was the first condition which fails most of the time or second condition is the most offending one and so on.

SELECT TOP 1
FROM table 
WHERE column1 = @param1  -- (cndtn 1)This condition works without anding with other conditions
  AND column2 = @param2
  AND column3 = @param3  -- (cndtn 3) This with 1 works 10% of the time
  AND column4 = @param4

One of the ideas I thought was to break the procedure to use one condition at a time.

DECLARE @retVal int

SELECT @retVal = COUNT(*) 
FROM table
WHERE column1 = @param1

IF (@retVal > 0)
   --Do Something like above but by using @param2, @param3 and so on

Issues

  1. If first check itself fails I wouldn't have a way forward to investigate into other combinations.
  2. This doesn't seem very efficient either as this stored procedure is called hundreds of times.

Other SO Post I also find this great post (Find which one of the WHERE clauses succeeded) but this isn't very helping when no records are returned.

CodePudding user response:

If this is just for debugging, what about detecting when the @@ROWCOUNT = 0 and storing those parameters in a separate debugging table?

SELECT  TOP 1 *
FROM    SomeTable
WHERE   column1 = @param1 
AND     column2 = @param2
AND     column3 = @param3 
AND     column4 = @param4
-- order by ....
;    

-- one or more parameters "failed"
IF @@ROWCOUNT = 0
BEGIN
     INSERT INTO SomeTable_Debug( createdDate, column1, column2, column3, column4, column5)
     VALUES (getDate(), @param1, @param2, @param3, @param4, @param5)
END

You can then use the debugging table later on, in a separate query script, without having to worry about it's impact on a frequently invoked procedure. For example, this query returns 1 when a condition "fails", otherwise it returns null. It's not optimized for efficiency, but should be fine for occasional debugging use:

SELECT *
       , (SELECT 1 WHERE NOT EXISTS (SELECT NULL FROM SomeTable st WHERE st.column1 = d.column1)) AS Matches_Column1
       , (SELECT 1 WHERE NOT EXISTS (SELECT NULL FROM SomeTable st WHERE st.column2 = d.column2)) AS Matches_Column2
       , (SELECT 1 WHERE NOT EXISTS (SELECT NULL FROM SomeTable st WHERE st.column3 = d.column3)) AS Matches_Column3
       , (SELECT 1 WHERE NOT EXISTS (SELECT NULL FROM SomeTable st WHERE st.column4 = d.column4)) AS Matches_Column4
       , (SELECT 1 WHERE NOT EXISTS (SELECT NULL FROM SomeTable st WHERE st.column5 = d.column5)) AS Matches_Column5
FROM   SomeTable_Debug d

Sample Results:

id createdDate column1 column2 column3 column4 column5 Matches_Column1 Matches_Column2 Matches_Column3 Matches_Column4 Matches_Column5
1 2022-04-18 16:51:11.487 1 22 3 4 5 null 1 null null null
2 2022-04-18 16:51:11.500 1 22 3 4 56 null 1 null null 1

db<>fiddle here

  • Related