Home > front end >  SQL fill missing date and null value
SQL fill missing date and null value

Time:08-25

I am using SAS Enterprise Guide 8.3 to connect IBM DB2.

I want to join and fill missing date and values.

I have a full calendar date table from 1/1/2021 up to yesterday.

Each ID can work 5 days to 7 days a week.

There is no target for Sunday.

4/3/2022 is Sunday.

Code is

SELECT t2.ID, 
          t1.CAL_DT, COALESCE(t2.EXPRESS,0) AS EXPRESS, COALESCE(t2.OTHRES,0) AS OTHRES , COALESCE(t2.CPRO_RPT,0) AS Total
          FROM WORK.QUERY_FOR_DATE t1
           left outer JOIN WORK.QUERY_FOR_CPRO_0000 t2 on t1.cal_dt = t2.cal_dt
      ORDER BY t2.ID asc ,t1.CAL_DT asc;

enter image description here

Sample tables are below.

Table 1.

ID Date Express Others Total
001 4/1/2022 0 2 2
001 4/2/2022 2 3 5
001 4/4/2022 1 2 3
001 4/5/2022 2 2 4
002 4/1/2022 0 3 3
002 4/4/2022 3 3 6
002 4/5/2022 1 2 3
003 4/1/2022 3 3 6
003 4/2/2022 4 4 8
003 4/3/2022 1 1 2
003 4/4/2022 3 4 7
003 4/6/2022 2 4 6

Table 2.

ID Date Target
001 4/1/2022 4
001 4/2/2022 4
001 4/4/2022 4
001 4/5/2022 4
002 4/1/2022 6
002 4/2/2022 6
002 4/4/2022 6
002 4/5/2022 6
003 4/1/2022 8
003 4/2/2022 8
003 4/4/2022 8
003 4/5/2022 8

I want the result in Table 3.

ID Date Express Others Total Target
001 4/1/2022 0 2 2 4
001 4/2/2022 2 3 5 4
001 4/3/2022 0 0 0 0
001 4/4/2022 1 2 3 4
001 4/5/2022 2 2 4 4
002 4/1/2022 0 3 3 6
002 4/2/2022 0 0 0 6
002 4/3/2022 0 0 0 0
002 4/4/2022 3 3 6 6
002 4/5/2022 1 2 3 6
003 4/1/2022 3 3 6 8
003 4/2/2022 4 4 8 8
003 4/3/2022 1 1 2 0
003 4/4/2022 3 4 7 8
003 4/5/2022 2 4 6 8

CodePudding user response:

I would use your date table to cross join with your ID to build a permutation of those two, then you can join ID back into the set and get a full set of ID and Dates along with your real data from your CPRO table. You could join back in again between MIN and MAX date to remove all the dates in your date table.

SELECT X.ID, X.CAL_DT
, COALESCE(T3.EXPRESS,0) EXPRESS
, COALESCE(T3.OTHRES,0) OTHRES
, COALESCE(T3.CPRO_RPT,0) CPRO_RPT
FROM ( SELECT DISTINCT T2.ID, T1.CAL_DT FROM (WORK.QUERY_FOR_DATE t1 CROSS JOIN WORK.QUERY_FOR_CPRO_0000 t2)) X 
INNER JOIN WORK.QUERY_FOR_CPRO_0000 T3 
ON X.ID = T3.ID

You could also add another join to keep you date range correct if you didn't want to hard code it in the where clause.

SELECT 
X.ID
, X.CAL_DT 
, COALESCE(T3.EXPRESS,0) EXPRESS 
, COALESCE(T3.OTHRES,0) OTHRES 
, COALESCE(T3.CPRO_RPT,0) CPRO_RPT 
FROM 
( 
SELECT DISTINCT T2.ID, T1.CAL_DT 
FROM (WORK.QUERY_FOR_DATE t1 
CROSS JOIN WORK.QUERY_FOR_CPRO_0000 t2)) X 
INNER JOIN WORK.QUERY_FOR_CPRO_0000 T3 ON X.ID = T3.ID
INNER JOIN 
(SELECT MIN(DT) AS MINDATE, MAX(DT) AS MAXDATE
FROM WORK.QUERY_FOR_CPRO_000) AS DATEPARAM ON X.CAL_DT BETWEEN MINDATE AND MAXDATE

CodePudding user response:

Since you have a Calendar table, you may try the following:

Select B.id, C.CAL_DT, COALESCE(D.express, 0) express, COALESCE(D.others, 0) others,
       COALESCE(D.total, 0) total, COALESCE(E.target, 0) target
From Calendar C
Cross Join (Select Distinct id From Table1) B
Left Join table1 D
On D.id = B.id And D.date_ = C.CAL_DT
Left Join table2 E
On E.id = B.id And E.date_ = C.CAL_DT
Order By B.id, C.CAL_DT

First, you have to join each id from table1 to every day in your calendar table, and that done by cross join the calendar with the distinct ids from table1.

Now, left join that result to table1, table2 to get which ids not having entries for a specific date (null values).

The COALESCE function is used to replace null values with 0.

See a demo using DB2 from db<>fiddle.

  • Related