Home > database >  Get Earliest Date corresponding to the latest occurrence of a recurring name
Get Earliest Date corresponding to the latest occurrence of a recurring name

Time:10-27

I have a table with Name and Date columns. I want to get the earliest date when the current name appeared. For example:

Name Date
X 30-Jan-2021
X 29-Jan-2021
X 28-Jan-2021
Y 27-Jan-2021
Y 26-Jan-2021
Y 25-Jan-2021
Y 24-Jan-2021
X 23-Jan-2021
X 22-Jan-2021

Now when I try to get the earliest date when current name (X) started to appear, I want 28-Jan, but the sql query would give 22-Jan-2021 because that's when X appeared originally for the first time.

CodePudding user response:

This is a gaps and island problem. Based on the sample data, this will work:

WITH Groups AS(
    SELECT YT.[Name],
           YT.[Date],
           ROW_NUMBER() OVER (ORDER BY YT.Date DESC) - 
           ROW_NUMBER() OVER (PARTITION BY YT.[Name] ORDER BY Date DESC) AS Grp
    FROM dbo.YourTable YT),
FirstGroup AS(
    SELECT TOP (1) WITH TIES
           G.[Name],
           G.[Date]
    FROM Groups G
    WHERE [Name] = 'X'
    ORDER BY Grp ASC)
SELECT MIN(FG.[Date]) AS Mi

db<>fiddle

CodePudding user response:

This is a type of gaps-and-islands problem.

There are many solutions. Here is one that is optimized for your case

  • Use LEAD/LAG to identify the first row in each grouping
  • Filter to only those rows
  • Number them rows and take the first one
WITH StartPoints AS (
    SELECT *,
      IsStart = CASE WHEN Name <> LEAD(Name, 1, '') OVER (ORDER BY Date DESC) THEN 1 END
    FROM YourTable
),
Numbered AS (
    SELECT *,
      rn = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC)
    FROM StartPoints
    WHERE IsStart = 1 AND Name = 'X'
)
SELECT
  Name, Date
FROM Numbered
WHERE rn = 1;

db<>fiddle

For SQL Server 2008 or earlier (which I strongly suggest you upgrade from), you can use a self-join with row-numbering to simulate LEAD/LAG

WITH RowNumbered AS (
    SELECT *,
      AllRn = ROW_NUMBER() OVER (ORDER BY Date ASC)
    FROM YourTable
),
StartPoints AS (
    SELECT r1.*,
      IsStart = CASE WHEN r1.Name <> ISNULL(r2.Name, '') THEN 1 END
    FROM RowNumbered r1
    LEFT JOIN RowNumbered r2 ON r2.AllRn = r1.AllRn - 1
),
Numbered AS (
    SELECT *,
      rn = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC)
    FROM StartPoints
    WHERE IsStart = 1
)
SELECT
  Name, Date
FROM Numbered
WHERE rn = 1;

CodePudding user response:

If i did understand, you want to know when the X disappeared and reappeared again. in that case you can search for gaps in dates by group.

this and example how to detect that

SELECT name
 ,DATE
FROM (
SELECT *
    ,DATEDIFF(day, lead(DATE) OVER (
            PARTITION BY name ORDER BY DATE DESC
            ), DATE) DIF
FROM YourTable
) a
WHERE DIF > 1
  • Related