Home > Net >  SQL Server Optimizer removes ISNULL calls
SQL Server Optimizer removes ISNULL calls

Time:11-12

The following query runs into an error with SQL Server 2017:

;with locations(RowNum, Latitude, Longitude) as (
    select 1, 12.3456, 45.6789
),
locationsWithPrevious as (
    select  *,
            PreviousLatitude = lag(l.Latitude) over(order by l.RowNum),
            PreviousLongitude = lag(l.Longitude) over(order by l.RowNum)
    from    locations as l
),
locationsWithDistance as (
    select  *,
            Distance = geography::Point(l.Latitude, l.Longitude, 4326).STDistance(geography::Point(l.PreviousLatitude, l.PreviousLongitude, 4326))
    from    locationsWithPrevious as l
    where   PreviousLatitude is not null
        and PreviousLongitude is not null
)
select  *
from    locationsWithDistance as l
where   Distance > 0

Msg 6569, Level 16, State 1, Line 1
'geography::Point' failed because parameter 1 is not allowed to be null.

Reason: The predicate "Distance > 0" is executed before filtering PreviousLatitude/-Longitude to IS NOT NULL. So far so good, because T-SQL is declarative and the order of operations here can be determined by the SQL Server. If you remove the predicate "Distance > 0", the query works without errors.

But now I would expect that the NULL values of the parameters can be prevented by using ISNULL() as follows:

Distance = geography::Point(l.Latitude, l.Longitude, 4326).STDistance(geography::Point(isnull(l.PreviousLatitude, 0), isnull(l.PreviousLongitude, 0), 4326))

But the query still returns the same error! The ISNULL() function is also not listed anywhere in the filter predicate of the execution plan!

Is this behaviour of the SQL Server correct? It looks to me like the SQL Server is incorrectly removing the ISNULL() calls due to the IS NOT NULL filtering.

Notes:

  1. When the "IS NOT NULL" conditions are removed, the error disappears because the ISNULL() function is now used in the filter predicate as expected (but the query has changed semantically, of course):

     locationsWithDistance as (
         select  *,
                 Distance = geography::Point(l.Latitude, l.Longitude, 4326).STDistance(geography::Point(isnull(l.PreviousLatitude, 0), isnull(l.PreviousLongitude, 0), 4326))
         from    locationsWithPrevious as l
     )
    
  2. However, the query works correctly if you replace the ISNULL() calls with CASE WHEN operations:

     Distance = geography::Point(l.Latitude, l.Longitude, 4326).STDistance(geography::Point(case when l.PreviousLatitude is not null then l.PreviousLatitude else 0 end, case when l.PreviousLongitude is not null then l.PreviousLongitude else 0 end, 4326))
    
  3. I also realise that the query can be better formulated as follows, by instantiating the Point directly in the base query:

     ;with locations(RowNum, GeoPosition) as (
         select 1, geography::Point(12.3456, 45.6789, 4326)
     ),
     locationsWithPrevious as (
         select  *,
                 PreviousGeoPosition = lag(l.GeoPosition) over(order by l.RowNum)
         from    locations as l
     ),
     locationsWithDistance as (
         select  *,
                 Distance = l.GeoPosition.STDistance(l.PreviousGeoPosition)
         from    locationsWithPrevious as l
         where   PreviousGeoPosition is not null
     )
     select  *
     from    locationsWithDistance as l
     where   Distance > 0
    

CodePudding user response:

It does indeed appear to be a bug. The compiler thinks the value is provably not null, and removes the ISNULL. However, COALESCE is not affected in the same way, it compiles down to CASE and the compiler doesn't have as much visibility on it.

The compiler puts intermediate calculations into Compute Scalar operators. But these can be calculated at different points by the Expression Service, and therefore the ISNULL should not have been removed.

One workaround, as you have found, is to remove the WHERE.

Another one is to use the extra parameter on LAG to add a default

;with locations(RowNum, Latitude, Longitude) as (
    select 1, 12.3456, 45.6789
),
locationsWithPrevious as (
    select  *,
            PreviousLatitude = lag(l.Latitude, 1, 0) over(order by l.RowNum),
            PreviousLongitude = lag(l.Longitude, 1, 0) over(order by l.RowNum)
    from    locations as l
),
locationsWithDistance as (
    select  *,
            Distance = geography::Point(l.Latitude, l.Longitude, 4326).STDistance(geography::Point(l.PreviousLatitude, l.PreviousLongitude, 4326))
    from    locationsWithPrevious as l
)
select  *
from    locationsWithDistance as l
where   Distance > 0

db<>fiddle

CodePudding user response:

I think this is symptomatic of chaining too much logic via common table expression. I often find myself breaking chains via #hash tables like this simply because it's easier to debug, and a lot easier to understand.

drop table if exists #locationsWithPrevious;
;with locations(RowNum, Latitude, Longitude) as (
    select 1, 12.3456, 45.6789
),
locationsWithPrevious as (
    select  *,
            PreviousLatitude = lag(l.Latitude) over(order by l.RowNum),
            PreviousLongitude = lag(l.Longitude) over(order by l.RowNum)
    from    locations as l
)
    select *
    into #locationsWithPrevious
    from locationsWithPrevious
    where   PreviousLatitude is not null
        and PreviousLongitude is not null;

with locationsWithDistance as (
    select  *,
            Distance = geography::Point(l.Latitude, l.Longitude, 4326).STDistance(geography::Point(l.PreviousLatitude, l.PreviousLongitude, 4326))
    from    #locationsWithPrevious as l
)
select  *
from    locationsWithDistance as l
where   Distance > 0;
  • Related