I am writing a data export where I need to return one row from a selection where there may be multiple rows. In this case, the second table is the telephone_current table. This table includes a row for several telephone types (CA, MA, PR, etc.), and they are not in any particular order. If the individual has a CA, I need to include that record; if not, then I would use either type MA or PR.
The query below works, technically, but it will run excruciatingly slow (10 minutes or more).
I need advice to fix this query to get one row (record) per individual. The slowdown occurs when I include the self join telephone_current tc. Note. I've also moved the AND into the WHERE clause, which runs with the time delay.
SELECT distinct igp.isu_id PersonnelNumber
, igp.preferred_first_name FirstName
, igp.current_last_name LastName
, NULL Title
, igp.current_mi MiddleInitial
, pd.email_preferred_address
, tc.phone_number_combined
, igp.isu_username networkID
, '0' GroupID
, e.home_organization_desc GroupName
, CASE
WHEN substr(e.employee_class,1,1) in ( 'N', 'C') THEN 'staff'
WHEN substr(e.employee_class,1,1) = 'F' THEN 'faculty'
ELSE 'other'
END GroupType
FROM isu_general_person igp
JOIN person_detail pd ON igp.person_uid = pd.person_uid
JOIN telephone_current tc ON igp.person_uid = tc.entity_uid
AND tc.phone_number = (
SELECT p.phone_number
FROM telephone_current p
WHERE tc.entity_uid = p.entity_uid
ORDER BY phone_type
FETCH FIRST 1 ROW ONLY
)
LEFT JOIN employee e ON igp.person_uid = e.person_uid
-- LEFT JOIN faculty f ON igp.person_uid = f.person_uid
WHERE 1=1
AND e.employee_status = 'A'
AND substr(e.employee_class,1,1) in ( 'N', 'C', 'F')
AND igp.isu_username IS NOT NULL
;
CodePudding user response:
Example using row_number() analytic and a common table expression. This limits to one phone per person by creating a partition/group of numbers under a given Entity_Uid orders this by a case expression and then assigns row number based on that case expression defined order then phone type, then phone number. The row number is then used to limit the results to just 1 phone number.
WITH BaseData as (
SELECT distinct igp.isu_id PersonnelNumber
, igp.preferred_first_name FirstName
, igp.current_last_name LastName
, NULL Title
, igp.current_mi MiddleInitial
, pd.email_preferred_address
, tc.phone_number_combined
, igp.isu_username networkID
, '0' GroupID
, e.home_organization_desc GroupName
, CASE
WHEN substr(e.employee_class,1,1) in ( 'N', 'C') THEN 'staff'
WHEN substr(e.employee_class,1,1) = 'F' THEN 'faculty'
ELSE 'other'
END GroupType,
row_number() over (PARTITION BY Entity_Uid ORDER BY case when phone_type ='CA' then 1
when phone_Type in ('MA','PR') then 2
else 3 end, phone_Type, Phone_number) RN
FROM isu_general_person igp
JOIN person_detail pd ON igp.person_uid = pd.person_uid
JOIN telephone_current tc ON igp.person_uid = tc.entity_uid
LEFT JOIN employee e ON igp.person_uid = e.person_uid
-- LEFT JOIN faculty f ON igp.person_uid = f.person_uid
WHERE 1=1
AND e.employee_status = 'A'
AND substr(e.employee_class,1,1) in ( 'N', 'C', 'F')
AND igp.isu_username IS NOT NULL)
SELECT *
FROM BaseData
WHERE RN = 1
;
Example As cross apply: Cross apply avoids the need of the analytic and basically says; hey; for each matching igp.person_uid = xtc.entity_uid, get the first record based on the order defined in the subquery. quit when you've got the 1st record for each user
SELECT distinct igp.isu_id PersonnelNumber
, igp.preferred_first_name FirstName
, igp.current_last_name LastName
, NULL Title
, igp.current_mi MiddleInitial
, pd.email_preferred_address
, tc.phone_number_combined
, igp.isu_username networkID
, '0' GroupID
, e.home_organization_desc GroupName
, CASE
WHEN substr(e.employee_class,1,1) in ( 'N', 'C') THEN 'staff'
WHEN substr(e.employee_class,1,1) = 'F' THEN 'faculty'
ELSE 'other'
END GroupType,
FROM isu_general_person igp
JOIN person_detail pd
ON igp.person_uid = pd.person_uid
CROSS APPLY (SELECT xtc.phone_number
FROM telephone_current xtc
WHERE igp.person_uid = xtc.entity_uid
ORDER BY case when phone_type = 'CA' then 1
when phone_Type in ('MA','PR') then 2
else 3 end,
phone_Type,
Telephone_current
FETCH FIRST 1 ROW ONLY) tc
LEFT JOIN employee e ON igp.person_uid = e.person_uid
-- LEFT JOIN faculty f ON igp.person_uid = f.person_uid
WHERE 1=1
AND e.employee_status = 'A'
AND substr(e.employee_class,1,1) in ( 'N', 'C', 'F')
AND igp.isu_username IS NOT NULL
CodePudding user response:
We did identify problem with the index on the telephone_current table. Once that was resolved, both the versions provided by xQbert worked to provide the single-row result for each individual. The version using WITH BaseData ran in approximately 12 seconds. However, this version returned all rows in 2.4 seconds.
SELECT distinct igp.isu_id PersonnelNumber
, igp.preferred_first_name FirstName
, igp.current_last_name LastName
, NULL Title
, igp.current_mi MiddleInitial
, pd.email_preferred_address
, tc.phone_number_combined
, igp.isu_username networkID
, '0' GroupID
, e.home_organization_desc GroupName
, CASE
WHEN substr(e.employee_class,1,1) in ( 'N', 'C') THEN 'staff'
WHEN substr(e.employee_class,1,1) = 'F' THEN 'faculty'
ELSE 'other'
END GroupType
FROM isu_general_person igp
JOIN person_detail pd
ON igp.person_uid = pd.person_uid
CROSS APPLY (SELECT xtc.phone_number_combined
FROM telephone xtc
WHERE igp.person_uid = xtc.entity_uid
ORDER BY case when phone_type = 'CA' then 1
when phone_Type in ('MA','PR') then 2
else 3 end,
phone_Type,
phone_number_combined
FETCH FIRST 1 ROW ONLY) tc
LEFT JOIN employee e ON igp.person_uid = e.person_uid
-- LEFT JOIN faculty f ON igp.person_uid = f.person_uid
WHERE 1=1
AND e.employee_status = 'A'
AND substr(e.employee_class,1,1) in ( 'N', 'C', 'F')
AND igp.isu_username IS NOT NULL