Home > Software design >  Why is it that a change in query plan suddenly seems to break a query
Why is it that a change in query plan suddenly seems to break a query

Time:07-28

I am trying to better understand a problem that occured recently in production.

A query that has been running without issues suddenly stops with the error:

Invalid length parameter passed to the LEFT or SUBSTRING function.

The query (somewhat simplified):

SELECT t.client
FROM acuhistr t
    ,acuheader h
    ,acrclient c
WHERE t.ext_inv_ref IN (
        SELECT substring(ext_inv_ref, 1, len(ext_inv_ref) - 3)
        FROM acutrans
        WHERE ext_inv_ref LIKE '%ROT'
            AND client = t.client
        )
    AND t.apar_id = h.apar_id
    AND c.client = t.client
    AND h.client = c.leg_act_cli

I can reproduce this error message by passing a negative length parameter to substring():

SELECT substring('test', 1, -1)

> Invalid length parameter passed to the substring function.

The acutrans table does contain rows thas has an empty ext_inv_ref but those should have been filtered out by the LIKE condition. If I run the subquery by itself it runs successfully.

I also tried running the full query in the dev environment and it works.

Comparing the query plans between prod and dev it seems that the prod query plan executes the SUBSTRING() before the LIKE '%ROT' condition has been applied which causes the query to fail.

Image comparing query plans between prod and dev

Am I correct in identifying the new query plan as the cause for why the query stops working? If so, why did SQL Server generate a plan that breaks the query?

SQL Server version is 2019

CodePudding user response:

Because filters and expressions may be evaluated in a different order (e.g. before or after values that break get filtered away). Think about it more simply, imagine you are a barber with a lobby full of customers:

  • cut one inch of hair from every customer
  • filter out the bald guys

vs.

  • filter out the bald guys
  • cut one inch of hair from the remaining customers

A simpler example in SQL Server is:

  • get the month name from a string
  • filter out the strings that aren't valid dates

vs.

  • filter out the strings that aren't valid dates
  • get the month name from the remaining strings which are valid dates

One way you get around this in SQL Server is to use a CASE expression to only evaluate rows that don't break the expression (or, in this example, to alter the expression so it doesn't break):

substring(ext_inv_ref, 1, len(ext_inv_ref) - CASE 
  WHEN len(ext_inv_ref) >= 3 THEN 3 ELSE 0 END)

In plain English this just says remove the right-most 3 characters but only if the string is already at least 3 characters long.

Or you can use COALESCE/NULLIF or a variety of other methods depending on whether you want an empty string or NULL when the input is < 3 characters.

Some other questions for more insight:

  • Related