Home > OS >  How do can I combine two columns to perform an aggregation?
How do can I combine two columns to perform an aggregation?

Time:01-21

Say I have a table with these columns:

person color date1 date2 aux car_type
1 green 1/10/22 null 11 suv
1 blue null 1/11/22 16 sedan
3 red 1/23/22 null 17 suv
3 yellow null 1/28/22 26 sedan
4 purple 1/27/22 null 14 suv
4 black null 1/21/22 17 sedan

I would like to obtain the earliest date a vehicle was bought, between a particular interval. If no vehicle was bought in that interval by a person, then that date will be null. If there are two vehicles bought in that interval by someone, the earliest one should be picked.

In other words, the following result set is the desired result if the interval is say, between 1/20/22 - 1/30/22 (inclusive):

person date car type
1 null null
3 1/23/22 suv
4 1/21/22 sedan

The pattern I thought of using for this is:

SELECT person, date, car_type 
FROM
(
    SELECT ROW_NUMBER() OVER (PARTITION BY person ORDER BY date ASC) AS rn, 
     person, 
     date, 
     car_type
    FROM car_table
) sub
WHERE sub.rn = 1;

But there are two date columns, and I'd like to know how I can combine them into one for this purpose; or if there is just a better way to achieve this.

CodePudding user response:

As I mentioned in the comments, assuming you are using the latest version of SQL Server (2022 at time of writing) you can use LEAST to get the lower value of the 2 date and time columns. LEAST ignores NULL values as well. It not clear if the logic for the dates should be based on if one of the dates should be between the dates or the lowest, so this may not give the required results for different data, but should give you enough to get there:

SELECT ID
INTO dbo.Person
FROM (VALUES(1),(3),(4))V(ID);

SELECT *
INTO dbo.BoughtVehicle
FROM (VALUES(1,'green','1/10/22',NULL,'11','suv'), --d/M/yy or M/d/yy are TERRIBLE date formats. Use a date and time data type!
            (1,'blue',NULL,'1/11/22','16','sedan'),
            (3,'red','1/23/22',NULL,'17','suv'),
            (3,'yellow',NULL,'1/28/22','26','sedan'),
            (4,'purple','1/27/22',NULL,'14','suv'),
            (4,'black',NULL,'1/21/22','17','sedan'))V(person,color,date1,date2,aux,car_type);
GO
WITH RNs AS(
    SELECT person,
           LEAST(date1,date2) AS date,
           car_type,
           ROW_NUMBER() OVER (PARTITION BY Person ORDER BY LEAST(date1,date2)) AS RN
    FROM dbo.BoughtVehicle
    WHERE (date1 >= '1/20/22' AND date1 <= '1/30/22')  --This uses ambiguous varchar literals, you should be using an unambiguous date format like yyyyMMdd against a date and time value
       OR (date2 >= '1/20/22' AND date2 <= '1/30/22'))
SELECT P.ID AS Person,
       RN.date,
       RN.car_type
FROM dbo.Person P
     LEFT JOIN RNs RN ON P.ID = RN.Person
                     AND RN.RN = 1;
GO

DROP TABLE dbo.BoughtVehicle;
DROP TABLE dbo.Person;
  • Related