I have balance of each account on working days, I need to calculate Avg_Balance
but I need to calculate on all days of the month including weekend for which data is missing. I want to repeat last available balances for accounts for missing dates.
For example:
BDate | CRM | ACCT | BAL |
---|---|---|---|
2023-10-24 | 123 | ab123 | 1000 |
2023-10-25 | 123 | ab123 | 1100 |
2023-10-27 | 123 | ab123 | 1200 |
2023-10-28 | 123 | ab123 | 1300 |
As you can see data is missing for 2023-10_26, I want replicate the data of 2023-10-25, but with date 2023-10-26 as given below.
BDate | CRM | ACCT | BAL |
---|---|---|---|
2023-10-24 | 123 | ab123 | 1000 |
2023-10-25 | 123 | ab123 | 1100 |
2023-10-26 | 123 | ab123 | 1100 |
2023-10-27 | 123 | ab123 | 1200 |
2023-10-28 | 123 | ab123 | 1300 |
How to do this in a SQL Server query? I'll appreciate your help.
CodePudding user response:
Assuming you have a date dimension table, you could join this to get a table with gaps and islands. There would be then multiple options to fill the gaps (NULL values). Here is a classic one:
First, let's build up your example:
CREATE TABLE RUNNUNG_BALANCE_TABLE (
BDATE DATE NOT NULL,
CRM INT,
ACCT NVARCHAR(5),
BAL INT
);
INSERT INTO RUNNUNG_BALANCE_TABLE
VALUES
('2023-10-24',123,'ab123',1000),
('2023-10-25',123,'ab123',1100),
('2023-10-27',123,'ab123',1200),
('2023-10-28',123,'ab123',1300);
CREATE TABLE DIM_DATE (
DATEKEY INT NOT NULL,
DATE_ISO DATE NOT NULL
);
INSERT INTO DIM_DATE
VALUES
(20231024,'2023-10-24'),
(20231025,'2023-10-25'),
(20231026,'2023-10-26'),
(20231027,'2023-10-27'),
(20231028,'2023-10-28'),
(20231029,'2023-10-29'),
(20231030,'2023-10-30');
We then use a Common Table Expression (CTE) to join the financial data table with the table containing the date values, and create a new column to flag groups of entries. We do this by using an OVER clause with COUNT as an aggregate function. We use this column to determine the 'last available' data. We can use MIN() here, because any value is always greater than NULL.
WITH CTE AS (
SELECT
DATE_ISO,
CRM,
ACCT,
BAL,
COUNT(CASE WHEN BAL IS NOT NULL THEN 1 END) OVER (ORDER BY DATE_ISO) AS grp
FROM RUNNUNG_BALANCE_TABLE
RIGHT JOIN DIM_DATE ON DATE_ISO = BDATE
)
SELECT
DATE_ISO AS BDATE,
MIN(CRM) OVER (PARTITION BY grp) AS CRM,
MIN(ACCT) OVER (PARTITION BY grp) AS ACCT,
MIN(BAL) OVER (PARTITION BY grp) AS BAL
FROM CTE;
The CTE would look like this:
DATE_ISO | CRM | ACCT | BAL | Grp |
---|---|---|---|---|
2023-10-24 | 123 | ab123 | 1000 | 1 |
2023-10-25 | 123 | ab123 | 1100 | 2 |
2023-10-26 | (null) | (null) | (null) | 2 |
2023-10-27 | 123 | ab123 | 1200 | 3 |
2023-10-28 | 123 | ab123 | 1300 | 4 |
2023-10-29 | (null) | (null) | (null) | 4 |
2023-10-30 | (null) | (null) | (null) | 4 |
And the final result would look like this:
BDATE | CRM | ACCT | BAL |
---|---|---|---|
2023-10-24 | 123 | ab123 | 1000 |
2023-10-25 | 123 | ab123 | 1100 |
2023-10-26 | 123 | ab123 | 1100 |
2023-10-27 | 123 | ab123 | 1200 |
2023-10-28 | 123 | ab123 | 1300 |
2023-10-29 | 123 | ab123 | 1300 |
2023-10-30 | 123 | ab123 | 1300 |
CodePudding user response:
Your query can be achieved with a table variable and a while loop. Using MIN and MAX to get the date range you want for the calculations your while loop can simply check if data exists on the day in question and if not, utilize the DATEADD() function to get the previous days data. I have included how to achieve the result below but had to make a couple of assumptions about the structure of the table.
-- Create a table variable to store the data...
DECLARE @CheckBalance AS TABLE
(
BDate DATE NOT NULL,
CRM VARCHAR(3) NOT NULL,
ACCT VARCHAR(5) NULL,
BAL INT
);
DECLARE @StartDate DATE, @EndDate DATE, @WorkingDate DATE;
-- Get the starting date and the end date you want to calculate...
-- Add an extra day to the end date for the while loop check.
SELECT @StartDate = MIN(BDate), @EndDate = DATEADD(DAY, 1, MAX(BDate)) FROM RunningBalance;
SELECT @WorkingDate = @StartDate;
WHILE @WorkingDate < @EndDate
BEGIN
IF NOT EXISTS(SELECT 1 FROM RunningBalance WHERE BDate = @WorkingDate)
BEGIN
-- Since the data is missing from the WorkingDate, insert the previous days data.
INSERT INTO @CheckBalance (BDate, CRM, ACCT, BAL)
SELECT @WorkingDate AS BDate, CRM, ACCT, BAL FROM RunningBalance WHERE BDate = DATEADD(DAY, -1, @WorkingDate);
END
ELSE
BEGIN
-- Data exists, insert that.
INSERT INTO @CheckBalance (BDate, CRM, ACCT, BAL)
SELECT BDate, CRM, ACCT, BAL FROM RunningBalance WHERE BDate = @WorkingDate;
END
SELECT @WorkingDate = DATEADD(DAY, 1, @WorkingDate);
END
SELECT BDate, CRM, ACCT, BAL FROM @CheckBalance;