I have two tables: Budget Line
and Expense
. They are structured in such a way that an expense must have either a parent record in the budget line table, or a parent record in the expense table. I need to select all child-most expenses for each budget line.
For example - BudgetLine
:
Id | Description |
---|---|
1 | TEST123 |
2 | OTHERTEST |
Expense
:
Id | ParentId | ParentType | Description |
---|---|---|---|
1 | 1 | BudgetLine | Group of Expenses |
2 | 1 | Expense | Expense # 1 |
3 | 1 | Expense | Expense # 2 |
4 | 2 | BudgetLine | Expense 3 |
Desired result:
BudgetLineId | ExpenseId | Description |
---|---|---|
1 | 2 | Expense # 1 |
1 | 3 | Expense # 2 |
2 | 4 | Expense # 3 |
I am looking to omit expenses in the result only if they are the only sub-child. Note that an expense may have many children, grandchildren, etc.
I have tried the following, and researching various recursive CTE methods:
WITH RCTE AS
(
SELECT Expense.Id, Expense.ParentId, Expense.ParentType, 1 AS Lvl, Expense.Id as startId FROM Expense
UNION ALL
SELECT rh.Id, rh.ParentId, rh.ParentType, Lvl 1 AS Lvl, rc.Id as startId FROM dbo.Expense rh
INNER JOIN RCTE rc ON rh.Id = rc.ParentId and rc.ParentType = 'Expense'
),
FilteredRCTE AS
(
SELECT startId, MAX(LVL) AS Lvl
FROM RCTE
GROUP BY startID
),
RecursiveData AS
(
SELECT FilteredRCTE.startId AS ExpenseId, RCTE.ParentId AS BudgetLineId
FROM FilteredRCTE
JOIN RCTE ON FilteredRCTE.startId = RCTE.startId AND FilteredRCTE.Lvl = RCTE.Lvl
)
SELECT *
FROM RecursiveData
Which did in-fact obtain all the child Expenses and their associated parent BudgetLine, but it also included the middle-tier expenses (such as item 1 in the example) and I cannot figure out how to filter those middle-tier items out.
Here is a script to create tables / insert sample data:
CREATE TABLE [dbo].[BudgetLine]
(
[Id] [int] IDENTITY(1,1) NOT NULL,
[Description] [varchar](500) NULL,
) ON [PRIMARY]
GO
INSERT INTO dbo.BudgetLine VALUES ('TEST123')
INSERT INTO dbo.BudgetLine VALUES ('OTHERTEST')
CREATE TABLE [dbo].[Expense]
(
[Id] [int] IDENTITY(1,1) NOT NULL,
[ParentId] [int] NOT NULL,
[ParentType] [varchar](100) NOT NULL,
[Description] [varchar](max) NULL,
) ON [PRIMARY]
GO
INSERT INTO dbo.Expense VALUES ('1', 'BudgetLine', 'Group of Expenses')
INSERT INTO dbo.Expense VALUES ('1', 'Expense', 'Expense # 1')
INSERT INTO dbo.Expense VALUES ('1', 'Expense', 'Expense # 2')
INSERT INTO dbo.Expense VALUES ('2', 'BudgetLine', 'Expense # 3')
CodePudding user response:
Maybe I have oversimplified, but the following returns your desired results, by checking that there is no other expense row connected to the current row.
WITH RCTE AS
(
SELECT E.Id ExpenseId, E.ParentId, E.ParentType
FROM #Expense E
UNION ALL
SELECT RH.Id, RH.ParentId, RH.ParentType
FROM #Expense RH
INNER JOIN RCTE RC ON RH.Id = RC.ParentId AND RC.ParentType = 'Expense'
)
SELECT *
FROM RCTE R1
WHERE NOT EXISTS (
SELECT 1
FROM RCTE R2
WHERE R2.ParentId = R1.ExpenseId AND R2.ParentType = 'Expense'
);