I have 2 data tables, both with a datetime column however other than this column no other matching columns thus not allowing me to do a JOIN.
I have attempted to do the join on the MonthYear alias and this is not giving me the expected results and it is significantly slower to execute, whilst doing them individually is instant.
However, every month will have data and as I am doing a count by month and ordering I'm thinking a join may not be essential. I've tried a UNION however this provided the results in additional rows when I require it to be in adjacent columns (see desired outcome).
table1:
SELECT
LEFT(DATENAME(MONTH,[date]),3) '-' RIGHT('00' CAST(YEAR([date]) AS VARCHAR),2) AS 'MonthYear',
COUNT(CASE WHEN responseType = 'positive' THEN 1 END) AS 'Positive',
COUNT(CASE WHEN responseType = 'negative' THEN 1 END) AS 'Negative'
FROM Database.dbo.Response
WHERE [date] BETWEEN '2022/09/01' AND '2022/12/01'
GROUP BY LEFT(DATENAME(MONTH,[date]),3) '-' RIGHT('00' CAST(YEAR([date]) AS VARCHAR),2)
ORDER BY MAX([date])
table1 results:
MonthYear Positive Negative
Sep-22 8738 6001
Oct-22 10120 4512
Nov-22 5621 5451
table2:
SELECT
LEFT(DATENAME(MONTH,[date]),3) '-' RIGHT('00' CAST(YEAR([date]) AS VARCHAR),2) AS 'MonthYear',
COUNT(CASE WHEN Reason = 'Legacy Unsub' THEN 1 END) AS 'Unsub',
COUNT(CASE WHEN Reason = 'Complaint' THEN 1 END) AS 'Complaint'
FROM Database.dbo.Complaint
WHERE [date] BETWEEN '2022/09/01' AND '2022/12/01'
GROUP BY LEFT(DATENAME(MONTH, [date]),3) '-' RIGHT('00' CAST(YEAR([date]) AS VARCHAR),2)
ORDER BY MAX([date])
table2 results:
MonthYear Unsub Complaint
Sep-22 541 5
Oct-22 171 0
Nov-22 459 12
My desired outcome:
MonthYear Positive Negative Unsub Complaint
Sep-22 8738 6001 541 5
Oct-22 10120 4512 171 0
Nov-22 5621 5451 459 12
CodePudding user response:
I would expect the following to give you your expected output and perform not significantly worse than running the two queries individually (although it will ultimately depend on how many different MonthYear
s you are returning).
WITH t1 AS
(
SELECT
LEFT(DATENAME(MONTH,[date]),3) '-' RIGHT('00' CAST(YEAR([date]) AS VARCHAR),2) AS 'MonthYear',
MAX([date]) AS 'SortOrder',
COUNT(CASE WHEN responseType = 'positive' THEN 1 END) AS 'Positive',
COUNT(CASE WHEN responseType = 'negative' THEN 1 END) AS 'Negative'
FROM Database.dbo.Response
WHERE [date] BETWEEN '2022/09/01' AND '2022/12/01'
GROUP BY LEFT(DATENAME(MONTH,[date]),3) '-' RIGHT('00' CAST(YEAR([date]) AS VARCHAR),2)
),
t2 AS (
SELECT
LEFT(DATENAME(MONTH,[date]),3) '-' RIGHT('00' CAST(YEAR([date]) AS VARCHAR),2) AS 'MonthYear',
COUNT(CASE WHEN Reason = 'Legacy Unsub' THEN 1 END) AS 'Unsub',
COUNT(CASE WHEN Reason = 'Complaint' THEN 1 END) AS 'Complaint'
FROM Database.dbo.Complaint
WHERE [date] BETWEEN '2022/09/01' AND '2022/12/01'
GROUP BY LEFT(DATENAME(MONTH, [date]),3) '-' RIGHT('00' CAST(YEAR([date]) AS VARCHAR),2)
)
SELECT t1.MonthYear, t1.Positive, t1.Negative, t2.Unsub, t2.Complaint
FROM t1 FULL OUTER JOIN t2
ON t1.MonthYear = t2.MonthYear
ORDER BY t1.SortOrder
This simply joins the results of your two existing queries (as CTEs) on MonthYear
. I added an additional SortOrder
column to the first CTE (since we want to sort in date order, not alphabetic order). You said "every month will have data", so perhaps an INNER JOIN
is sufficient in your case, but the FULL OUTER JOIN
is probably safer for things like this (where possibly one of the tables doesn't yet have data for a month that the other table has).
Although SQL isn't procedural so there are no guarantees, I expect SQL Server will run your existing queries in about the same time it takes to run them individually, and then matching, say, a few thousand MonthYear
s should be relatively insignificant.
If this does not have acceptable performance, I would consider adding a computed column MonthYear
to both tables and indexing that.
CodePudding user response:
You can drop both the result sets in a temporary table and use that one to do the joining.
SELECT
LEFT(DATENAME(MONTH, [date]), 3) '-'
RIGHT('00' CAST(YEAR([date]) AS VARCHAR), 2) AS 'MonthYear',
COUNT( CASE WHEN responseType = 'positive' THEN 1
END
) AS 'Positive',
COUNT( CASE WHEN responseType = 'negative' THEN 1
END
) AS 'Negative'
INTO #ResultsTable1
FROM [Database].dbo.Response
WHERE [date]
BETWEEN '2022/09/01' AND '2022/12/01'
GROUP BY LEFT(DATENAME(MONTH, [date]), 3) '-'
RIGHT('00' CAST(YEAR([date]) AS VARCHAR), 2)
ORDER BY MAX([date]);
SELECT
LEFT(DATENAME(MONTH, [date]), 3) '-'
RIGHT('00' CAST(YEAR([date]) AS VARCHAR), 2) AS 'MonthYear',
COUNT( CASE WHEN Reason = 'Legacy Unsub' THEN 1
END
) AS 'Unsub',
COUNT( CASE WHEN Reason = 'Complaint' THEN 1
END
) AS 'Complaint'
INTO #ResultsTable2
FROM [Database].dbo.Complaint
WHERE [date]
BETWEEN '2022/09/01' AND '2022/12/01'
GROUP BY LEFT(DATENAME(MONTH, [date]), 3) '-'
RIGHT('00' CAST(YEAR([date]) AS VARCHAR), 2)
ORDER BY MAX([date]);
SELECT * FROM #ResultsTable1 AS rt
LEFT JOIN #ResultsTable2 AS rt2 ON rt2.MonthYear = rt.MonthYear