I am struggling to find a way to create a table in SQL where I can consolidate all the sales dates and the products available. I have been tasked to create a table that contains a range of dates between january 1st 2022 and december 31st 2022.
My data sample consists of where date between 01/01/2022 and 31/12/2022. The date table is a table where I generated dates for the year and I have to use it to match my products and sales.
CREATE TABLE Date_Table (
DateID INT IDENTITY CONSTRAINT PK_Date_Table PRIMARY KEY,
Date DATE NOT NULL,
Month TINYINT NOT NULL,
Year INT NOT NULL
);
CREATE TABLE Sales_Table (
SaleID INT IDENTITY CONSTRAINT PK_Sales_Table PRIMARY KEY,
Date_of_Sale DATE NOT NULL,
Qty_sold INT NOT NULL,
Total MONEY NOT NULL,
ProductID INT NOT NULL CONSTRAINT FK_Product_Table REFERENCES Product_table(ProductID));
);
CREATE TABLE Product_table (
ProductID INT IDENTITY CONSTRAINT PK_Product_Table PRIMARY KEY,
ProductName NVARCHAR(25) NOT NULL,
ProductNumber NVARCHAR(25) NOT NULL
);
I am yet to find a way to be able to create a table that looks like this:
Date | Product | "Qty Sold" |
---|---|---|
2022-01-01 | AB1 | 0 |
2022-01-01 | AB2 | 2 |
2022-01-01 | AB3 | 1 |
2022-01-01 | AB4 | 0 |
2022-01-01 | AB5 | 2 |
2022-01-02 | AB1 | 3 |
2022-01-02 | AB2 | 1 |
2022-01-02 | AB3 | 0 |
2022-01-02 | AB4 | 0 |
2022-01-02 | AB5 | 0 |
I thus want all the dates matched with all the products even if no sale has been made. I have been working at it for some time and I can't seem to find a solution. If anyone has a little idea that could help me start it would be extremely appreciated
So far, I have just made some joins between my tables hoping that a full join might allow me to keep all the data necessary but it does not work.
CodePudding user response:
The trick with this is to first create all your rows, then only after that do you populate with the quantity sold.
The easiest way to create all rows is to do a CROSS JOIN
e.g.,
SELECT Date_Table.[Date], Product_Table.ProductID, Product_Table.ProductName
FROM Date_Table
CROSS JOIN Product_Table
The above will give you a row for every product, on every date (so if 365 days for 5 products, will be 365 x 5 = 1825 rows).
With this, you then join it to your actual sales, using a LEFT JOIN so that it keeps your initial rows even if they don't have a match in sales. And to deal with the blank rows, use ISNULL to make them 0.
SELECT Date_Table.[Date], Product_Table.ProductID, Product_Table.ProductName,
ISNULL(Sales_Table.Qty_sold, 0) AS Qty_Sold
FROM (Date_Table
CROSS JOIN Product_Table)
LEFT OUTER JOIN Sales_Table ON Date_Table.[Date] = Sales_Table.[Date_of_sale]
AND Product_Table.ProductID = Sales_Table.ProductID
Edit: Here is a db<>fiddle with the data and SQL for you to review
Two notes about data structure setup
- Your PKs - I would usually try to make the PK meaningful (e.g., in a date table, use the date as the PK) - though it's not necessary. However, I would then suggest at least putting a nonclustered index on them so that you can look up the date (or product, etc) directly.
- I would also avoid using column names that are SQL keywords like date, month, year - and I would also consider naming columns with more meaning e.g., instead of 'Total', call it 'Total_Revenue' or 'Total_Cost'. Most of your fields follow this rule but ... just letting you know.