Counting records for each day where a child record has a certain value-CodePudding

I have two tables (see SQL scripts to create tables and data at the end):

Order table with columns

ID
Date

OrderItems table with columns

OrderID
ItemTypeID

I need to select the count of orders for every day that have at least one item with ItemTypeID=6000 and the count of orders that do not have ItemTypeID=6000 in the order at all. So far I got this, but I have no clue how to move on:

SELECT
    DATEADD(dd, 0, DATEDIFF(dd, 0, OrderDate)) AS OrderDate,
    COUNT(DISTINCT(Orders.ID)) AS TotalOrders
FROM
    Orders
JOIN
    OrderItems
ON
    Orders.ID = OrderItems.OrderID 
WHERE
    OrderItems.ItemTypeID = 6000
GROUP BY
    DATEADD(dd, 0, DATEDIFF(dd, 0, OrderDate))

Script to create test data:

CREATE TABLE OrderItems(
    [ID] [int] NOT NULL,
    [OrderID] [int] NOT NULL,
    [ItemTypeID] [int] NOT NULL
)
CREATE TABLE Orders(
    [ID] [int] NOT NULL,
    [OrderDate] [date] NOT NULL
)
GO
INSERT [OrderItems] ([ID], [OrderID], [ItemTypeID]) VALUES (1, 1, 1000)
INSERT [OrderItems] ([ID], [OrderID], [ItemTypeID]) VALUES (2, 1, 6000)
INSERT [OrderItems] ([ID], [OrderID], [ItemTypeID]) VALUES (3, 2, 1000)
INSERT [OrderItems] ([ID], [OrderID], [ItemTypeID]) VALUES (4, 3, 1000)
INSERT [OrderItems] ([ID], [OrderID], [ItemTypeID]) VALUES (5, 3, 1000)
INSERT [OrderItems] ([ID], [OrderID], [ItemTypeID]) VALUES (6, 4, 1000)
INSERT [OrderItems] ([ID], [OrderID], [ItemTypeID]) VALUES (7, 4, 6000)
INSERT [Orders] ([ID], [OrderDate]) VALUES (1, CAST(N'2021-12-01' AS Date))
INSERT [Orders] ([ID], [OrderDate]) VALUES (2, CAST(N'2021-12-01' AS Date))
INSERT [Orders] ([ID], [OrderDate]) VALUES (3, CAST(N'2021-12-02' AS Date))
INSERT [Orders] ([ID], [OrderDate]) VALUES (4, CAST(N'2021-12-03' AS Date))
GO

The expected results should look like this:

OrderDate   OrdersWithItem  OrdersWithoutItem
2021-12-01  1               1
2021-12-02  0               1
2021-12-03  1               0

CodePudding user response：

(Similar to xQbert's answer) I usually use a CROSS APPLY to calculate an intermediate value that I can then use to feed later logic - in this case a flag indicating whether or not an order has the desired item.

Try:

SELECT
    O.OrderDate,
    OrdersWithItem = COUNT(CASE WHEN X.HasItem = 1 THEN 1 END),
    OrdersWithoutItem = COUNT(CASE WHEN X.HasItem = 0 THEN 1 END)
FROM Orders O
CROSS APPLY (
    SELECT HasItem = CASE WHEN EXISTS(
        SELECT * FROM OrderItems OI WHERE OI.OrderID = O.ID AND OI.ItemTypeID = 6000
        ) THEN 1 ELSE 0 END
) X
GROUP BY O.OrderDate
ORDER BY O.OrderDate

Note the "1" in the "THEN 1" in the above "COUNT(CASE...)" style is arbitrary. It just needs to be distinguishable from the implied "ELSE NULL" case.

CodePudding user response：

You can use logic in an aggregation function like COUNT

SELECT
  OrderDate
, COUNT(DISTINCT CASE WHEN ItemTypeID =  6000 THEN ItemTypeID END) AS OrdersWithItem 
, COUNT(DISTINCT CASE WHEN ItemTypeID != 6000 THEN ItemTypeID END) AS OrdersWithoutItem
FROM Orders
JOIN OrderItems
  ON Orders.ID = OrderItems.OrderID 
GROUP BY OrderDate

Do note however that, because of the DISTINCT, that it'll make a difference what ID is used in the ELSE.

SELECT
  OrderDate
, COUNT(DISTINCT CASE WHEN ItemTypeID =  6000 THEN ItemTypeID END) AS OrdersWithItem 
, COUNT(DISTINCT CASE WHEN ItemTypeID != 6000 THEN ItemTypeID END) AS OrdersWithoutItem
FROM Orders
JOIN OrderItems
  ON Orders.ID = OrderItems.OrderID 
GROUP BY OrderDate

OrderDate  | OrdersWithItem | OrdersWithoutItem
:--------- | -------------: | ----------------:
2021-12-01 |              1 |                 1
2021-12-02 |              0 |                 1
2021-12-03 |              1 |                 1

SELECT
  OrderDate
, COUNT(DISTINCT CASE WHEN ItemTypeID =  6000 THEN Orders.ID END) AS OrdersWithItem 
, COUNT(DISTINCT CASE WHEN ItemTypeID != 6000 THEN Orders.ID END) AS OrdersWithoutItem
FROM Orders
JOIN OrderItems
  ON Orders.ID = OrderItems.OrderID 
GROUP BY OrderDate

OrderDate  | OrdersWithItem | OrdersWithoutItem
:--------- | -------------: | ----------------:
2021-12-01 |              1 |                 2
2021-12-02 |              0 |                 1
2021-12-03 |              1 |                 1

db<>fiddle here

Given the names OrdersWithItem & OrdersWithoutItem, it would seem the 2nd is more correct.

CodePudding user response：

TESTED: DBFiddle.uk Example

My theory here is we can use an outer apply to get the top 1 row of an order item which has a desired item. count those, and simply subtract that count from a total count to get those that do not. We use coalesce to handle the NULL results from the outer apply when an order doesn't have the item in question.

You really need to add to your test data though as you're not hitting enough test cases to know if the solutions will meet all your needs.

SELECT O.OrderDate
     , count(Z.hasItem)  OrdersWithItem
     , count(*)-count(Z.HasItem) as OrdersWithoutItems
FROM Orders O
OUTER APPLY (SELECT TOP 1 1 as hasItem
             FROM OrderItems OI
             WHERE OI.ItemTypeID=6000 
               AND O.ID = OI.OrderID
             ORDER BY OI.ID ) z
GROUP BY O.OrderDate

CodePudding user response：

A slightly different answer:

select 
    O.OrderDate,
    count(case when Typ6000.OrderID is not null then O.ID end) as OrdersWithItem,
    count(case when Typ6000.OrderID is null then O.ID end) as OrdersWithoutItem
    
from #Orders O
     
     left join
     (
      select distinct OrderId
      from #OrderItems OI
      where OI.ItemTypeID=6000
     )  Typ6000
     on Typ6000.OrderId=O.ID

group by O.OrderDate