Home > Back-end >  Filling nulls with last not null value in SQL Server on Postal Codes
Filling nulls with last not null value in SQL Server on Postal Codes

Time:11-24

I have two tables PostalCodes (with one column with values from 00-00 to 99-999) and Customers (which has, beside all the customer's data, a postal code and ID of employee which is serving the customer).

So these two I am simply joining via postal code:

SELECT DISTINCT
    KP.postal,
    K.IDemp
FROM
    PostalCodes KP 
LEFT JOIN
    [Customers] K ON K.postal = KP.postal

and I'm getting this:

| postal | IDemp |
 -------- ------- 
| 00-000 | NULL  |
| 00-001 | NULL  |
| 00-001 | 12PH  |
| 00-002 | NULL  |
| 00-003 | NULL  |
| 00-004 | NULL  |
| 00-004 | 10PH  |
| 00-005 | NULL  |
| ...    | ...   |

So as you can see not all postal codes are used in the Customers table, but for my aim I need all postal codes assigned to some employee to created something like "area of service", so to do that I want to fill null values with last not null value to get something like this:

| postal | IDemp |
 -------- ------- 
| 00-000 | NULL  |
| 00-001 | 12PH  |
| 00-002 | 12PH  |
| 00-003 | 12PH  |
| 00-004 | 10PH  |
| 00-005 | 10PH  |
| ...    | ...   |

I was trying to use LAG() function, but it was not working (or at least I don't know how use it properly)

LAG(K.IDemp) OVER (ORDER BY KP.postal)

I found few similar questions already, but could not come up how to use their answers to my case.

CodePudding user response:

SQL Server doesn't support the ignore nulls option on LAG (yet), but you can get around this by creating a binary value from the column you want to order by, and the column you want to retrieve and calling MAX which does ignore nulls. A full working solution would be:

IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
    DROP TABLE #T;

CREATE TABLE #T (Postal VARCHAR(6) NOT NULL, IDemp VARCHAR(4) NULL);
INSERT #T (Postal, IDemp)
VALUES
    ('00-000', NULL),
    ('00-001', '12PH'),
    ('00-002', NULL),
    ('00-003', NULL),
    ('00-004', '10PH'),
    ('00-005', NULL);


SELECT  *,
        LastNonNull = CONVERT(VARCHAR(6), 
                            SUBSTRING(
                                MAX(CONVERT(BINARY(6), Postal)   CONVERT(BINARY(4), IDemp)) 
                                    OVER(ORDER BY Postal), 7,4))
FROM    #T;

It might help explain if this is broken down a bit and we look at the results of this:

SELECT  *,
        BinaryValue = CONVERT(BINARY(6), Postal)   CONVERT(BINARY(4), IDemp)
FROM    #T
Postal IDemp BinaryValue
00-000 NULL NULL
00-001 12PH 0x30302D30303131325048
00-002 NULL NULL
00-003 NULL NULL
00-004 10PH 0x30302D30303431305048
00-005 NULL NULL

Since concatenating null value yields a null value, you only get a value where it is not null. You can then take advantage of binary sorting (left to right) and get the maximum value of this binary within a windowed function, that is the part: MAX(...) OVER(ORDER BY Postal).

This removes all the NULL values (since MAX ignores NULL) apart from the first row, since there is no previous non null value and gives data as follows:

Postal IDemp MaxBinaryValue
00-000 NULL NULL
00-001 12PH 0x30302D30303131325048
00-002 NULL 0x30302D30303131325048
00-003 NULL 0x30302D30303131325048
00-004 10PH 0x30302D30303431305048
00-005 NULL 0x30302D30303431305048

It is then just a case of extracting the portion of the binary you are interested in (characters 7-10) and converting back to varchar using SUBSTRING and CONVERT

CodePudding user response:

A correlated sub-query might work:

SELECT DISTINCT
    KP.postal,
    (SELECT TOP 1 K.IDemp 
     FROM [Customers] K
     WHERE K.postal <= KP.postal
     AND K.IDemp Is Not Null
     ORDER BY K.postal DESC) As IDemp
FROM
    PostalCodes KP 
  • Related