Here is my SQL Server table
ID Job ParentID MyTeam
1 CEO NULL
2 CFO 1
3 CTO 1
4 CMO 1
5 Accounting Manager 2
6 Payroll Manager 2
7 Data Manager 3
8 Software Manager 3
9 Data Analyst 7
10 Data Engineer 7
I need to fill the MyTeam field this way
each job will have all people that job managing
CEO's team will be CEO, CFO, CTO, CMO, Accounting Manager, Payroll Manager, Data Manager, Software Manager, Data Analyst, Data Engineer
CFO's team will be CFO, Accounting Manager, Payroll Manager
CTO's team will be CTO, Data Manager, Software Manager, Data Analyst, Data Engineer
I built a loop on this data and contacted each job to its parent and so on
but this is too slow
Is there a faster one update statement to do that fast
CodePudding user response:
This approach uses a loop, but should be relatively fast
- It runs based on the number of levels in the jobs, rather than the individuals (so if you have 3 levels like here, it does 3 loops)
- The reads and calculations are very efficient - it doesn't need to do a lot of JOINs and sorting and table scans (other than the first)
There are examples of this running, with some extra reporting, in this dbfiddle
/* Initial data setup */
CREATE TABLE #Emps (ID int PRIMARY KEY, Job nvarchar(30), ParentID int);
INSERT INTO #Emps (ID, Job, ParentID) VALUES
(1, N'CEO', NULL),
(2, N'CFO', 1),
(3, N'CTO', 1),
(4, N'CMO', 1),
(5, N'Accounting Manager', 2),
(6, N'Payroll Manager', 2),
(7, N'Data Manager', 3),
(8, N'Software Manager', 3),
(9, N'Data Analyst', 7),
(10, N'Data Engineer ', 7);
CREATE TABLE #EmpParents (RoundNum int, ID int, ParentID int, PRIMARY KEY (RoundNum, ID));
The data for each 'round' (e.g., iteration number in the loop) is inserted into the #EmpParents table.
The approach is to a) First include the base job, and the parent job b) Then for the parent job, identify their parents Repeat b) until no more parent jobs are found.
The way it finds the 'parents of parents' is to look at the last 'round' in the loop to get the the parent ID for each job from that round; in the next round it uses these new parents to identify the next level of parents.
Note that I've done this with a WHILE loop because I think it's easier to read/understand. However, you could do this with a CURSOR or recursive CTE - but they are variations on a loop anyway.
/* Data calculations - Find parents, and parents of parents */
DECLARE @RoundNum int; -- Used for counting what iteration we're up to
DECLARE @NumInserted int; -- User for recording number of rows inserted
-- Insert initial data (own job and parent job)
INSERT INTO #EmpParents (RoundNum, ID, ParentID)
SELECT 0, ID, ID
FROM #Emps
UNION ALL
SELECT 1, ID, ParentID
FROM #Emps
WHERE ParentID IS NOT NULL;
SET @NumInserted = @@ROWCOUNT;
SET @RoundNum = 1;
-- Add 'parents of parents' until no more found
WHILE @NumInserted > 0
BEGIN
SET @RoundNum = 1;
INSERT INTO #EmpParents (RoundNum, ID, ParentID)
SELECT @RoundNum, #EmpParents.ID, #Emps.ParentID
FROM #EmpParents
INNER JOIN #Emps ON #EmpParents.ParentID = #Emps.ID
WHERE #EmpParents.RoundNum = @RoundNum - 1
AND #Emps.ParentID IS NOT NULL;
SET @NumInserted = @@ROWCOUNT;
END
In the final reporting, note that it uses the last column (ParentID) as the main column, and finds all the job IDs that correspond to it.
/* Reporting */
SELECT ParentEmp.Job, STRING_AGG(TeamEmp.Job, ', ') AS Team
FROM #EmpParents
INNER JOIN #Emps AS ParentEmp ON #EmpParents.ParentID = ParentEmp.ID
INNER JOIN #Emps AS TeamEmp ON #EmpParents.ID = TeamEmp.ID
GROUP BY #EmpParents.ParentID, ParentEmp.Job
ORDER BY #EmpParents.ParentID;
Here are the results
Job Team
------------------------------ ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CEO CEO, CFO, CTO, CMO, Accounting Manager, Payroll Manager, Data Manager, Software Manager, Data Analyst, Data Engineer
CFO Accounting Manager, Payroll Manager, CFO
CTO CTO, Data Manager, Software Manager, Data Analyst, Data Engineer
CMO CMO
Accounting Manager Accounting Manager
Payroll Manager Payroll Manager
Data Manager Data Manager, Data Analyst, Data Engineer
Software Manager Software Manager
Data Analyst Data Analyst
Data Engineer Data Engineer
Final note: There is no error checking in this and assumes a hierarchical structure (e.g., you cannot have a loop of parents e.g., my subordinate is my boss' boss). You may want to put some error checking into the loop/etc to control for that. A simple one would be that if RoundNum
gets to an impossibly impractical high number (e.g., 50) then it aborts with an error.
CodePudding user response:
You can use a recursive CTE for this.
First, get all of the employees, and for each one get its child rows if any, using the recursion. Note that we have not made any loop checking here, the assumption is that there are no loops.
Then aggregate it by the top ID that you started with, and join it back to the main table.
WITH cte AS (
SELECT e.ID, e.Job, e.ID AS TopID
FROM Employee e
UNION ALL
SELECT e.ID, e.Job, cte.TopID
FROM cte
JOIN Employee e ON e.ParentID = cte.ID
)
UPDATE e
SET MyTeam = grouped.Jobs
FROM Employee e
JOIN (
SELECT
cte.TopID,
STRING_AGG(cte.Job, ', ') AS Jobs
FROM cte
GROUP BY
cte.TopID
) grouped ON grouped.TopID = e.ID;
You can also enforce an ordering by calculating a Level
column in the CTE
WITH cte AS (
SELECT e.ID, e.Job, e.ID AS TopID, 1 AS Level
FROM Employee e
UNION ALL
SELECT e.ID, e.Job, cte.TopID, cte.Level 1
FROM cte
JOIN Employee e ON e.ParentID = cte.ID
)
UPDATE e
SET MyTeam = grouped.Jobs
FROM Employee e
JOIN (
SELECT
cte.TopID,
STRING_AGG(cte.Job, ', ') WITHIN GROUP (ORDER BY cte.Level) AS Jobs
FROM cte
GROUP BY
cte.TopID
) grouped ON grouped.TopID = e.ID;