Home > Mobile >  Valid From and Valid To columns - Only Show Latest Row based on Value between Dates
Valid From and Valid To columns - Only Show Latest Row based on Value between Dates

Time:10-21

Tearing my hair out over this one and not 100% sure it's possible, but everything is possible in SQL so it must be?!

The aim is to be able to historically track how many active accounts there were at the end of each month.

I have two tables:

  • star_UserAccounts

    • Holds account information every hour.
    • If the data is the same, nothing happens.
    • If the data changes, the existing row is flagged as old and the new row is inserted. This enables a nice little history to see what changes are being made.
  • LastDayOfMonth

    • Holds dates that I'd like the data for (just as an example)

Within the "star_UserAccounts" table I have (Ascending):

UserId UserStatus InsertedDate ValidToDate Row Version
JoeBloggs Active 2019-07-19 13:43:09.083 2019-10-31 16:08:27.633 1
JoeBloggs Active 2019-10-31 16:08:28.027 2020-01-09 10:08:27.840 2
JoeBloggs Active 2020-01-09 10:08:28.013 2020-01-09 11:08:28.813 3
JoeBloggs Active 2020-01-09 11:08:28.970 2020-01-16 11:08:24.547 4

Now - I would like to return the details that were valid on the 31st October 2019. There were two updates on that day and therefore the LATEST RowVersion for that day should be returned.

I can get the individual row version for the account when hardcoding the date into my select script, however when I then try to apply this across multiple different dates in one go, it is picking up both rows for the 31st October 2019, but I only want it to fetch the latest row on the days needed.

LastDayOfMonth ActiveAtDate
2019-07-31 1
2019-08-31 1
2019-09-30 1
2019-10-31 2 <-- The problem row
2019-11-30 1
2019-12-31 1

How an earth can I make sure the latest row is pulled for each date that is being pushed in?

Here is a SQL Fiddle to play with: http://sqlfiddle.com/#!18/ed372/12

I have kept in the syntax error which is within the subquery, as this sort of shows what I am trying to achieve as I need to pass the date into the sub query that collects the MAX record for the date.

Appreciate any assistance. Such a hard one to explain in an easy way, apologies if it's very confusing.

Thanks.

CodePudding user response:

You are over-complicating it. You can use simple row-numbering to do this

SELECT
    LastDayOfMonth,
    COUNT(*) ActiveAtDate
FROM (
    SELECT
      ld.LastDayOfMonth,
      ua.UserId,
      rn = ROW_NUMBER() OVER (PARTITION BY ld.LastDayOfMonth, ua.UserId ORDER BY ua.ValidToDate DESC)
    FROM dbo.star_UserAccounts ua
    INNER JOIN LastDayOfMonth ld
      ON ld.LastDayOfMonth >= ua.InsertedDate
      AND DATEADD(day, 1, ld.LastDayOfMonth) < ua.ValidToDate
    WHERE ua.UserId = 'JoeBloggs'
) latestRowForDate
WHERE rn = 1
GROUP BY LastDayOfMonth;

SQL Fiddle

CodePudding user response:

UPDATE

After re-reading your post I realized I had not fully understood your question prior to answering. I will leave my original answer as it may prove to be helpful.

Your situation may be as simple as using COUNT DISTINCT, like so:

SELECT
    LastDayOfMonth,
    COUNT ( DISTINCT UserId ) AS ActiveAtDate
FROM @star_UserAccounts AS ua
CROSS APPLY (

    SELECT LastDayOfMonth FROM @LastDayOfMonth AS ld
        WHERE ld.LastDayOfMonth BETWEEN CAST( ua.InsertedDate AS date ) AND CAST( ua.ValidToDate AS date )
        
) AS x
GROUP BY
    LastDayOfMonth
ORDER BY
    LastDayOfMonth;

RETURNS

 ---------------- -------------- 
| LastDayOfMonth | ActiveAtDate |
 ---------------- -------------- 
| 2019-07-31     |            1 |
| 2019-08-31     |            1 |
| 2019-09-30     |            1 |
| 2019-10-31     |            1 |
| 2019-11-30     |            1 |
| 2019-12-31     |            1 |
 ---------------- -------------- 

Using CROSS APPLY still has the same effect in limiting the resultset to users with activity during the period in the LastDayOfMonth table.

ORIGINAL ANSWER BELOW

Try something like the following:

DECLARE @star_UserAccounts table (
    [UserId] nvarchar (20) COLLATE Latin1_General_CI_AS NOT NULL,
    [InsertedDate] datetime NULL,
    [ValidToDate] datetime NULL,
    [RowVersion] int NOT NULL
);

INSERT INTO @star_UserAccounts ( [UserId], [InsertedDate], [ValidToDate], [RowVersion] )
VALUES
    ('JoeBloggs', '2020-01-09 11:08:28.970', '2020-01-16 11:08:24.547', 4 ), 
    ('JoeBloggs', '2020-01-09 10:08:28.013', '2020-01-09 11:08:28.813', 3 ), 
    ('JoeBloggs', '2019-10-31 16:08:28.027', '2020-01-09 10:08:27.840', 2 ), 
    ('JoeBloggs', '2019-07-19 13:43:09.083', '2019-10-31 16:08:27.633', 1 );

DECLARE @LastDayOFMonth table (
    LastDayOfMonth date
);

INSERT INTO @LastDayOFMonth ([LastDayOfMonth])
VALUES
    ('2019-07-31 00:00:00' ), 
    ('2019-08-31 00:00:00' ), 
    ('2019-09-30 00:00:00' ), 
    ('2019-10-31 00:00:00' ), 
    ('2019-11-30 00:00:00' ), 
    ('2019-12-31 00:00:00' );

;WITH current_activity AS (
    SELECT
        UserID, MAX( RowVersion ) AS CurrentVersion
    FROM @star_UserAccounts
    GROUP BY
        UserID
)
SELECT
    ua.*
FROM current_activity
INNER JOIN @star_UserAccounts AS ua
    ON current_activity.UserID = ua.UserID
    AND current_activity.CurrentVersion = ua.[RowVersion]
WHERE EXISTS (
    
    SELECT * FROM @star_UserAccounts AS u
    CROSS APPLY (
        SELECT * FROM @LastDayOfMonth AS ld
            WHERE CAST( ld.LastDayOfMonth AS date ) BETWEEN CAST( u.InsertedDate AS date ) AND CAST( u.ValidToDate AS date )
    ) AS d
    WHERE
        u.UserId = current_activity.UserId

)
ORDER BY
    current_activity.UserId;

RETURNS

 ----------- ------------------------- ------------------------- ------------ 
|  UserId   |      InsertedDate       |       ValidToDate       | RowVersion |
 ----------- ------------------------- ------------------------- ------------ 
| JoeBloggs | 2020-01-09 11:08:28.970 | 2020-01-16 11:08:24.547 |          4 |
 ----------- ------------------------- ------------------------- ------------ 

The CTE gets the most current UserId's RowVersion and then looks to see if the user had any activity during the dates listed in LastDayOfMonth.

Using CROSS APPLY restricts the results to users with any activity that includes the LastDayOfMonth.

  • Related