Home > database >  SQL Server and difficulty generating column with unique values
SQL Server and difficulty generating column with unique values

Time:11-01

I'm using Microsoft SQL Server Management Studio, and I want a new column that calculates the following:

  • If it has an ‘Exec’ value for category, it takes the ‘enddate’.
  • If it has an ‘Scop’ value for category, it takes the ‘start date’.

This new column calculates the number of months between these two.

I want SQL to do the calculation for a given id, so each id will have different values calculated.

At the moment it takes the minimum enddate and minimum 'startdate' for the entire table.

SELECT
    id, category, startdate, enddate,
    CASE
        WHEN id = id 
            THEN DATEDIFF(month, 
                          (SELECT MIN(enddate) from [A].[PP] where category = 'Exec'),
                          (SELECT MIN(startdate) from [A].[PP] where category = 'Scop')) --AS datemodify
            ELSE NULL
    END
FROM
    [A].[PP]
WHERE
    startdate IS NOT NULL
    AND (category = 'Exec' OR category = 'Scop')
ORDER BY 
    id ASC

Results it produces at the moment:

id category startdate enddate NewCOlumn
1 Scop 2022-11-1 2022-10-1 11
1 Exec 2023-11-1 2023-10-1 11
2 Scop 2022-11-1 2022-10-1 11
2 Exec 2023-11-1 2023-09-1 11

The results I want:

id category startdate enddate NewCOlumn
1 Scop 2021-11-1 2022-10-1 24
1 Exec 2023-11-1 2023-11-1 24
2 Scop 2022-11-1 2022-10-1 11
2 Exec 2023-11-1 2023-09-1 11

CodePudding user response:

Based on comments I'm not sure you still know you want as your output so I've come up with two different versions.

Here's how I'm created a version of your data set:

INSERT INTO #TempTable (ID, Category, StartDate, EndDate)
VALUES (1, 'Scop', '2021-11-01', '2022-10-01'),
       (1, 'Exec', '2023-11-01', '2023-10-01'),
       (2, 'Scop', '2022-11-01', '2022-10-01'),
       (2, 'Exec', '2023-11-01', '2023-10-01');

This is the first version, this created your two lines per ID but hacks the StartDate and EndDate from different rows. This works by selecting all of the data straight out of the temp table, it then goes on to say if the row is Category = Scop then do a DateDiff between the StartDate and then fetches the EndDate from a subquery where the IDs match and the Category = Exec (it also has the same logic applied but the other way around for where the initial Category = Exec):

SELECT TT.ID,
       TT.Category,
       TT.StartDate,
       TT.EndDate,
       CASE
           WHEN TT.Category = 'Scop' THEN DATEDIFF(M, TT.StartDate, (SELECT EndDate FROM #TempTable WHERE Category = 'Exec' AND ID = TT.ID))
           ELSE CASE
                    WHEN TT.Category = 'Exec' THEN DATEDIFF(M, (SELECT StartDate FROM #TempTable WHERE Category = 'Scop' AND ID = TT.ID), TT.EndDate)
                END
       END AS DateDiffCalc
FROM #TempTable AS TT;

This version compresses the IDs to a single row, it initially only fetches Scop data, but then joins back to itself using ID and specifices now to get the Exec data only. Now you can DateDiff between the Scop StartDate and the Exec EndDate

SELECT DISTINCT t1.ID,
                t1.Category,
                t1.StartDate,
                T2.Category,
                T2.EndDate,
                DATEDIFF(M, t1.StartDate, T2.EndDate) AS DateDiffCalc
FROM #TempTable AS t1
     INNER JOIN #TempTable AS T2 ON T2.ID = T2.ID AND T2.Category = 'Exec'
WHERE t1.Category = 'Scop'
ORDER BY t1.ID;
  • Related