I've got a query that first declares and inserts data into a temporary table. This table is being used to filter out the main table in sql server that contains customer data.
Query with dummy data filled into temp table is as follows:
Declare @myInputTable TABLE (Customer_input INT, Date_of_interest bigint)
insert into @myInputTable values(12345, '20140924'),(22234,'20210508')
select top 1
MainTable.CustomerID,
MainTable.PurchaseDate
From MainTable
Where (customerID in (select Customer_input from @myInputTable) and PurchaseDate <= (select Date_of_interest from @myInputTable))
When I comment out the 2nd customer date of interest the query runs normally and gives me the result, but moment it deals with multiple customer and date of interest combinations it spits out the following error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
I understand that the problem lies in "PurchaseDate <=...", but I am not sure how else I can filter out results to match customerIDs and only output the closest purchase date records to the date of interest of each customer. Any pointers would be appreciated!
Test data from MainTable:
CustomerID | PurchaseDate |
---|---|
12345 | 20150120 |
12345 | 20140213 |
12345 | 20120811 |
22234 | 20210419 |
22234 | 20220322 |
Expected result from the query overall:
CustomerID | PurchaseDate |
---|---|
12345 | 20140213 |
22234 | 20210419 |
The query should rule out all PurchaseDate that are >Date_of_interest and only display the closest past PurchaseDate to Date_of_interest of a given CustomerID (matched to Customer_input). Customer 12345 has 2 dates that qualify (20140213 and 20120811), but only 20140213 is outputted as it's the closest to the date of interest listed in myInputTable (12345, '20140924')
CodePudding user response:
I would suggest all you need is an inner join and then simple aggregation for your desired results:
select m.customerId, Max(PurchaseDate) PurchaseDate
From @myInputTable t
join MainTable m on m.CustomerID = t.Customer_input and m.PurchaseDate <= t.Date_of_interest
group by m.customerId;
CodePudding user response:
As you need to map each source row to a row from your variable table, you have to do a join and not simply check that a value exists somewhere in it. And as you need just one value per customer, you will need to use ROW_NUMBER
. Maybe something like this (not tested):
SELECT *
FROM (
SELECT
MT.CustomerID,
MT.PurchaseDate,
Number = ROW_NUMBER() OVER (PARTITION BY MT.CustomerID ORDER BY MT.PurchaseDate DESC)
FROM MainTable MT
INNER JOIN @myInputTable IT
ON MT.CustomerID = IT.Customer_input AND MT.PurchaseDate <= IT.Date_of_interest
) Subquery
WHERE Number = 1
ORDER BY CustomerID, PurchaseDate
CodePudding user response:
This can also be achieved by inner joining with the table variable and aggregating it.
DECLARE @myInputTable TABLE ( Customer_input INT, Date_of_interest INT ); insert into @myInputTable values (12345, '20140924') , (22234, '20210508') ; SELECT t.CustomerID , MAX(PurchaseDate) AS PurchaseDate FROM MainTable t JOIN @myInputTable i ON i.Customer_input = t.CustomerID AND i.Date_of_interest >= t.PurchaseDate GROUP BY t.CustomerID, i.Date_of_interest
CustomerID PurchaseDate 12345 20140213 22234 20210419
Test on db<>fiddle here
Side-note : Columns that contain dates should have the DATE datatype.
You can insert dates in the same "yyyyMMdd" format.