I have two queries that work perfectly:
DECLARE @StartDate DATETIME = '2021-11-01 00:00:00';
DECLARE @EndDate DATETIME = '2022-03-16 23:59:59';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;
SELECT
DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth,
DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber,
DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear,
COUNT(TransactionId) AS TransactionCount
FROM
GeneralJournal
WHERE
GeneralJournal.[TransactionDate] >= @StartDate
AND GeneralJournal.[TransactionDate] <= @EndDate
AND MasterRecord = 1
AND TransactionTypeId = @SalesEstimateTransactionTypeId
GROUP BY
DATEPART(yyyy, GeneralJournal.[TransactionDate]),
DATEPART(mm, GeneralJournal.[TransactionDate]),
DATENAME(mm,GeneralJournal.[TransactionDate]);
SELECT
DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth,
DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber,
DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear,
COUNT(DISTINCT TransactionId) AS ConversionCount
FROM
GeneralJournal
WHERE
GeneralJournal.[TransactionDate] >= @StartDate
AND GeneralJournal.[TransactionDate] <= @EndDate
AND MasterRecord = 0
AND TransactionTypeId = @SalesOrderTransactionTypeId
AND SEReferenceId > 0
GROUP BY
DATEPART(yyyy, GeneralJournal.[TransactionDate]),
DATEPART(mm, GeneralJournal.[TransactionDate]),
DATENAME(mm,GeneralJournal.[TransactionDate]);
Note that the second query returns distinct because it can return multiple values and we only want to count each TransactionId
once in that scenario. These return the following results:
ReportingMonth | MonthNumber | ReportingYear | TransactionCount |
---|---|---|---|
November | 11 | 2021 | 82 |
December | 12 | 2021 | 49 |
January | 1 | 2022 | 64 |
February | 2 | 2022 | 67 |
March | 3 | 2022 | 49 |
ReportingMonth | MonthNumber | ReportingYear | ConversionCount |
---|---|---|---|
November | 11 | 2021 | 42 |
December | 12 | 2021 | 27 |
January | 1 | 2022 | 31 |
February | 2 | 2022 | 50 |
March | 3 | 2022 | 24 |
I actually need to combine them like this:
ReportingMonth | MonthNumber | ReportingYear | TransactionCount | ConversionCount |
---|---|---|---|---|
November | 11 | 2021 | 82 | 42 |
December | 12 | 2021 | 49 | 27 |
January | 1 | 2022 | 64 | 31 |
February | 2 | 2022 | 67 | 50 |
March | 3 | 2022 | 49 | 24 |
I have tried pretty much everything I can think of - Unions, Joins, Subqueries - but so far nothing is quite right. This is the closest I can get:
SELECT
DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth,
DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber,
DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear,
SUM(CASE
WHEN TransactionTypeId = @SalesEstimateTransactionTypeId
AND MasterRecord = 1
THEN 1 ELSE 0
END) AS TransactionCount,
COUNT(CASE
WHEN TransactionTypeId = @SalesOrderTransactionTypeId
AND SEReferenceId > 0 THEN 1
END) AS ConversionCount
FROM
GeneralJournal
WHERE
GeneralJournal.[TransactionDate] >= @StartDate
AND GeneralJournal.[TransactionDate] <= @EndDate
AND TransactionTypeId IN (@SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
GROUP BY
DATEPART(yyyy, GeneralJournal.[TransactionDate]),
DATEPART(mm, GeneralJournal.[TransactionDate]),
DATENAME(mm,GeneralJournal.[TransactionDate]);
However, I am unable to find a way to get a Distinct value for the ConversionCount
. As a result it is returning the full count:
ReportingMonth | MonthNumber | ReportingYear | TransactionCount | ConversionCount |
---|---|---|---|---|
November | 11 | 2021 | 82 | 152 |
December | 12 | 2021 | 49 | 67 |
January | 1 | 2022 | 64 | 101 |
February | 2 | 2022 | 67 | 136 |
March | 3 | 2022 | 49 | 64 |
Can anyone guide me towards a way to combine the two query results whilst maintaining the Distinct on the conversion count? I must add that for it to work the answer must be compatible with both SQL Server and VistaDB the syntax of which is a subset of T-SQL because I am obliged to support both database engines with the same query.
CodePudding user response:
You can just put the two columns in the same query. It is made more complicated by the fact that the WHERE
clauses are slightly different. SO you need to group, then group again, and use conditional aggregation to count the right rows for each column.
Note the following:
- You could in theory do
COUNT(DISTINCT CASE
however that is normally slower as the compiler will not recognize what theCASE
is doing and instead do a full sort. - It is faster to group by a single
EOMONTH
calculation to group by a whole month. You can pull out the year and month in theSELECT
. COUNT(TransactionId)
will return the number of non-nullTransactionId
values. ifTransactionId
cannot be null thenCOUNT(*)
is the same thing.- If
TransactionDate
has a time component then you should use a half-open interval>= AND <
- Use aliases on tables, it makes your queries more readable.
- Use whitepsace, it's free.
DECLARE @StartDate DATETIME = '2021-11-01T00:00:00';
DECLARE @EndDate DATETIME = '2022-03-17T00:00:00';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;
SELECT
DATENAME(month, gj.mth) AS ReportingMonth,
DATEPART(month, gj.mth) AS MonthNumber,
DATEPART(year , gj.mth) AS ReportingYear,
SUM(TransactionCount) AS TransactionCount,
COUNT(CASE WHEN ConversionCount > 0 THEN 1 END) AS ConversionCount
FROM (
SELECT
EOMONTH(gj.TransactionDate) AS mth,
gj.TransactionId,
COUNT(CASE WHEN gj.MasterRecord = 1 AND gj.TransactionTypeId = @SalesEstimateTransactionTypeId THEN 1 END) AS TransactionCount,
COUNT(CASE WHEN gj.MasterRecord = 0 AND AND TransactionTypeId = @SalesOrderTransactionTypeId AND AND SEReferenceId > 0 THEN 1 END) AS ConversionCount
FROM GeneralJournal gj
WHERE gj.TransactionDate >= @StartDate
AND gj.TransactionDate < @EndDate
AND gj.TransactionTypeId IN (@SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
GROUP BY
EOMONTH(gj.TransactionDate),
TransactionId
) g
GROUP BY
mth;
CodePudding user response:
Your second query that is close, I think just has a couple of minor omissions.
- You forgot
MasterRecord = 0
in your ConversionCountCASE
statement. - Instead of returning 1 or 0 from your ConversionCount
CASE
you should return TransactionID or NULL so you can still count distinct values. - You are missing
DISTINCT
inside of your ConversionCountCOUNT
. - You will need to handle NULL values in the ConversionCount
COUNT
. I assumed you will always have at one or moreNULL
s, so I just subtract 1 from theCOUNT(DISTINCT ...)
to compensate.
(I can't be 100% on the syntax here without some example detail data to work with.)
Code
SELECT
ReportingMonth = DATENAME(mm, GeneralJournal.TransactionDate),
MonthNumber = DATEPART(mm, GeneralJournal.TransactionDate),
ReportingYear = DATEPART(yyyy, GeneralJournal.TransactionDate),
TransactionCount = SUM(CASE
WHEN TransactionTypeId = @SalesEstimateTransactionTypeId
AND MasterRecord = 1 THEN
1
ELSE
0
END
),
ConversionCount = COUNT(DISTINCT CASE
WHEN TransactionTypeId = @SalesOrderTransactionTypeId
AND SEReferenceId > 0
AND MasterRecord = 0 THEN
TransactionID
ELSE
NULL
END
) - 1 /* Subtract 1 for the NULL */
FROM GeneralJournal
WHERE
GeneralJournal.TransactionDate >= @StartDate
AND GeneralJournal.TransactionDate <= @EndDate
AND TransactionTypeId IN (
@SalesOrderTransactionTypeId,
@SalesEstimateTransactionTypeId
)
GROUP BY
DATEPART(yyyy, GeneralJournal.TransactionDate),
DATEPART(mm, GeneralJournal.TransactionDate),
DATENAME(mm, GeneralJournal.TransactionDate);