I have two tables Response
and Distributions
:
Table structure with expected output
In this case, there are multiple responses but 1 distribution. We need to tie just one latest response before the assignment date to the distribution, basically
MAX(COALESCE(RESPONSE_DATE, CREATED_DATE)) <= ASSIGNMENT_DATE
The SQL query I tried:
SELECT
resp.CONTACT_ID, resp.RESPONSE_ID, resp.RESPONSE_DATE,
resp.CREATED_DATE, d.ASSIGNMENT_DATE AS DISTRIBUTION_DATE
FROM
Response resp
LEFT JOIN
Distribution d ON resp.CONTACT_ID = d.CONTACT_ID
-- 12 Hour Grace Period For assignments created before response
AND DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE)) <= d.ASSIGNMENT_DATE
This query returns DISTRIBUTION_DATE
as 2020-10-28 for first two rows when.
This condition
DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE)) <= ASSIGNMENT_DATE
is satisfied (see SQL query returning wrong output table in the attached screenshot).
However, I want DISTRIBUTION_DATE
as "2020-10-28" only for the second row as shown in the expected output. The reason is the latest response just before the assignment date will get distributed and I don't care about initial X responses (We should tie one distribution to only one latest response)
I tried to use
MAX(DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE))) <= ASSIGNMENT_DATE
in the JOIN ON
condition but that doesn't work in SQL.
Please let me know how to structure the query and get the expected output.
NOTE: The join from Response
to Distribution
has to be on CONTACT_ID
, there is no explicit JOIN_KEY
and it can lead to 1:M joins as we have same CONTACT_ID
(that's the reason a lot of filtering is done in join ON condition), ideal scenario is to have RESPONSE_ID
on Distribution
table as well but that's not how the data is structured.
TIA
CodePudding user response:
I am assuming that the response date can be null which is why you are using COALESCE.
The following query will return the last response whose date is less than assignment date.
SELECT d.*, r.*
FROM [dbo].[Response] r
INNER JOIN [dbo].[Distribution] d
ON r.CONTACT_ID = d.CONTACT_ID
WHERE COALESCE(r.[RESPONSE_DATE],[CREATED_DATE]) =
(
SELECT MAX(COALESCE(rr.[RESPONSE_DATE],rr.[CREATED_DATE]))
FROM [dbo].[Response] rr
INNER JOIN [dbo].[Distribution] dd
ON rr.CONTACT_ID = dd.CONTACT_ID
WHERE COALESCE(rr.[RESPONSE_DATE],rr.[CREATED_DATE]) < dd.ASSIGNMENT_DATE
AND rr.CONTACT_ID = r.CONTACT_ID
)
It may be possible to simplify this but it will return what you need. Also if there are two responses with the same date, both will be returned.
CodePudding user response:
If I understand what you require, you'll need to run a query like this to get the expected results:
SELECT CONTACT_ID, RESPONSE_ID, RESPONSE_DATE, CREATED_DATE, CASE WHEN RANKING = 1 THEN DISTRIBUTION_DATE ELSE NULL END AS DISTRIBUTION_DATE
FROM (
SELECT
resp.CONTACT_ID, resp.RESPONSE_ID, resp.RESPONSE_DATE,
resp.CREATED_DATE, d.ASSIGNMENT_DATE AS DISTRIBUTION_DATE,
DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE)) AS GRACE_DATE,
RANK() OVER (
PARTITION BY resp.CONTACT_ID
ORDER BY
CASE
WHEN DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE)) <= d.ASSIGNMENT_DATE
THEN DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE))
ELSE
'19000101'
END
DESC ) AS RANKING
FROM
Response resp
LEFT JOIN
[Distribution] d ON resp.CONTACT_ID = d.CONTACT_ID
) DerivedTable
ORDER BY CONTACT_ID, RESPONSE_ID
Basically, I use a rank to workout the closest date based on the assignment date, and then know that the ranking where it is 1 should be the one to show the distribution date.