I've been trying to run my query but it always seem to be running like forever. Would like to know how can I Improve on performance for this Had to add in the group by Clause as the left join seem to be creating duplication when viewing the results.
Hierarchical Query Code
with NextValue (equip_id, link_id, node_fr_id, node_to_id ) as
(
SELECT lm.equipment_id, l.link_id, l.node_fr_id_num, l.node_to_id_num
FROM road_details l
LEFT JOIN Equipment_details e ON l.link_id = e.link_id
Where 1=1
START WITH l.link_id = 2
CONNECT By NOCYCLE Prior l.node_to_id_num = l.node_fr_id_num
)
Select * from NextValue
-- Select Particular Equipment starting with 'DET' --
Where 1=1 and equip_id like 'DET%'
-- Remove Duplicate cause by Join clause --
Group by equip_id, link_id, node_fr_id, node_to_id;
Expected output:
equip_id | link_id | node_fr_id | node_to_id |
---|---|---|---|
DET_276627 | 990 | 1138 | 1141 |
DET_546012 | 1881 | 4856 | 2322 |
DET_546010 | 1980 | 2397 | 2437 |
DET_526526 | 13176 | 14259 | 14260 |
DET_526014 | 13724 | 14762 | 14763 |
DET_536513 | 114721 | 114445 | 114446 |
DET_526524 | 106130 | 105001 | 106402 |
DET_516043 | 18803 | 19196 | 19198 |
DET_526024 | 106319 | 106753 | 105919 |
DET_526515 | 105796 | 105762 | 103751 |
DET_526518 | 1061319 | 106514 | 104601 |
DET_526037 | 99221 | 96871 | 94382 |
DET_526510 | 99185 | 97133 | 97881 |
Current output:
equip_id | link_id | node_fr_id | node_to_id |
---|---|---|---|
DET_276627 | 990 | 1138 | 1141 |
DET_546012 | 1881 | 4856 | 2322 |
DET_546010 | 1980 | 2397 | 2437 |
DET_526526 | 13176 | 14259 | 14260 |
DET_526014 | 13724 | 14762 | 14763 |
DET_536513 | 114721 | 114445 | 114446 |
DET_526524 | 106130 | 105001 | 106402 |
DET_516043 | 18803 | 19196 | 19198 |
DET_526024 | 106319 | 106753 | 105919 |
DET_526515 | 105796 | 105762 | 103751 |
DET_526518 | 1061319 | 106514 | 104601 |
DET_526518 | 1061319 | 106514 | 104601 |
DET_526518 | 1061319 | 106514 | 104601 |
DET_526518 | 1061319 | 106514 | 104601 |
DET_526518 | 1061319 | 106514 | 104601 |
DET_526518 | 1061319 | 106514 | 104601 |
DET_526518 | 1061319 | 106514 | 104601 |
DET_526037 | 99221 | 96871 | 94382 |
DET_526510 | 99185 | 97133 | 97881 |
DET_526510 | 99185 | 97133 | 97881 |
DET_526510 | 99185 | 97133 | 97881 |
DET_526510 | 99185 | 97133 | 97881 |
for odd reason it creating duplicates and hence had to add in the Group clause to counter this issue.. but with the group clause added in i've notice the longer querying duration occuring.
Updates for commenters (not enough space to explain): Hi @HimanshuKandpal, let me just explain the query and what it is suppose to do. There are 2 tables, one is road_details and the other is an equipment_details. PK for the road detailsis Link_id, where equipment_id is the PK in the equipment_details and link_id is FK.
Brief explanation road details for context of what this hieratical query is aiming to do. Picture a map of inter connecting roads, e.g. a single road is made up of multiple links (Blocks) each block has a fr_id (head) and to_id (tail) [link_id 1 -> link_id 2 -> link_id 3] where link_id1.to_id = link_id2.fr_id
as they are linked that way. The query is supposed to find out what is the next connecting link based on the criteria.
2nd criteria, with each link can have multiple equipment on it or non at all. If a user were to select a certain equipment type, as shown in my where clause are they able to know what is the next or previous link ID of a particular equipment with support from hieratical query.
CodePudding user response:
The columns you SELECT
are all from the road_details
table (assuming lm.equipment_id
is a typo as there is no lm
table alias and it should be l.equipment_id
) and the WHERE
filter is a column from the road_details
table and the START WITH
and CONNECT BY
conditions are all using columns from the road_details
table.
You LEFT OUTER JOIN
the Equipment_details
table which means that having matching rows is optional and you NEVER use any of its columns to either filter the input or generate the output so the Equipment_details
table appears to be irrelevant to the output. At best, the SQL engine will optimise query and remove the join; at worst, there are multiple matching rows in the Equipment_details
table for each row in the road_details
table which will generate duplicates and then this number of duplicates will be exponentially magnified as the paths are followed deeper into the hierarchy.
For example, if you have the path without equipment:
A -> B -> C
And compare it to having the same path with 2 equipment on each road:
A (with equipment A1) -> B (with equipment B1) -> C (with equipment C1)
A (with equipment A1) -> B (with equipment B1) -> C (with equipment C2)
A (with equipment A1) -> B (with equipment B2) -> C (with equipment C1)
A (with equipment A1) -> B (with equipment B2) -> C (with equipment C2)
A (with equipment A2) -> B (with equipment B1) -> C (with equipment C1)
A (with equipment A2) -> B (with equipment B1) -> C (with equipment C2)
A (with equipment A2) -> B (with equipment B2) -> C (with equipment C1)
A (with equipment A2) -> B (with equipment B2) -> C (with equipment C2)
The query will generate 8 duplicates that terminate a C
(and it will get worse as the hierarchy deepens or if multiple paths reach the same junction).
You will still need to filter out duplicates as you can follow roads A -> B -> C
and also A -> D -> C
so C
would be reached by two distinct paths without any duplication from equipment but including equipment will make the duplicates much, much more prevalent.
Since you never use the Equipment_details
table, do not include it in the query and just use the road_details
table:
SELECT DISTINCT
l.equipment_id,
l.link_id,
l.node_fr_id_num,
l.node_to_id_num
FROM road_details l
WHERE l.equipment_id like 'DET%'
START WITH
l.link_id = 2
CONNECT BY NOCYCLE
PRIOR l.node_to_id_num = l.node_fr_id_num
(Note: using either GROUP BY
with all the columns or DISTINCT
will be doing the same job and performance should be similar for both and can be used interchangeably; you should profile each version and check. DISTINCT
may be easier for future developers to comprehend what you are trying to achieve.)
If the equipment_id
column is in the Equipment_details
table then you can generate the road hierarchy and then later join the equipment:
SELECT e.equipment_id,
l.*
FROM ( SELECT DISTINCT
link_id,
node_fr_id_num,
node_to_id_num
FROM road_details
START WITH
link_id = 2
CONNECT BY NOCYCLE
PRIOR node_to_id_num = node_fr_id_num
) l
INNER JOIN Equipment_Details e
ON (l.link_id = e.link_id)
WHERE e.equipment_id like 'DET%'
or:
SELECT e.equipment_id,
l.*
FROM ( SELECT DISTINCT
link_id,
node_fr_id_num,
node_to_id_num
FROM road_details
START WITH
link_id = 2
CONNECT BY NOCYCLE
PRIOR node_to_id_num = node_fr_id_num
) l
LEFT OUTER JOIN Equipment_Details e
ON ( l.link_id = e.link_id
AND e.equipment_id like 'DET%')