I have a STUDENT
table and need to update the STUDENT_ID
values by prefixing with the letter SS
followed by STUDENT_ID
value. For any duplicate STUDENT_ID
records, I should prefix the duplicate records as SS1
SS2
. Below is an example
Before Update:
NUM | STUDENT_ID |
---|---|
1 | 9234 |
2 | 9234 |
3 | 9234 |
4 | 3456 |
5 | 3456 |
6 | 789 |
7 | 956 |
After Update:
NUM | STUDENT_ID |
---|---|
1 | SS9234 |
2 | SS19234 |
3 | SS29234 |
4 | SS3456 |
5 | SS13456 |
6 | SS789 |
7 | SS956 |
Below is the query for updating the STUDENT_ID
for unique records.
update student set student_id = 'SS'||student_id ;
commit;
Need suggestion for updating the STUDENT_ID
for duplicate records. There are around 1 million duplicate records in the table and total volume is around 40 million. Appreciate for any inputs for performance enhancement.
CodePudding user response:
You can use a MERGE
statement correlated on the ROWID
pseudo-column and using the ROW_NUMBER()
analytic function:
MERGE INTO table_name dst
USING (
SELECT ROWID as rid,
ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY num) AS rn
FROM table_name
) src
ON (src.rid = dst.ROWID)
WHEN MATCHED THEN
UPDATE
SET student_id = 'SS' || CASE WHEN rn > 1 THEN rn - 1 END || dst.student_id;
Which, for the sample data:
CREATE TABLE table_name (NUM, STUDENT_ID) AS
SELECT 1, CAST('9234' AS VARCHAR2(20)) FROM DUAL UNION ALL
SELECT 2, '9234' FROM DUAL UNION ALL
SELECT 3, '9234' FROM DUAL UNION ALL
SELECT 4, '3456' FROM DUAL UNION ALL
SELECT 5, '3456' FROM DUAL UNION ALL
SELECT 6, '789' FROM DUAL UNION ALL
SELECT 7, '956' FROM DUAL;
Then after the MERGE
the table contains:
NUM | STUDENT_ID |
---|---|
1 | SS9234 |
2 | SS19234 |
3 | SS29234 |
4 | SS3456 |
5 | SS13456 |
6 | SS789 |
7 | SS956 |
CodePudding user response:
I'm sure there must be a better way, but this query can get the job done:
update t
set student_id = (
select new_student_id
from (
select x.*, 'SS' || case when rn = 1 then '' else '' || rn end
|| student_id as new_student_id
from (
select t.*, row_number() over(partition by student_id order by num) as rn
from t
) x
) y
where t.num = y.num
)
Result:
NUM STUDENT_ID
---- ----------
1 SS9234
2 SS29234
3 SS39234
4 SS3456
5 SS23456
6 SS789
7 SS956
See running example at db<>fiddle.
CodePudding user response:
Maybe you could do it without updating!?
I would probably try to :
CREATE NEW_TABLE AS
SELECT [do the "update" here] FROM OLD_TABLE;
- add indexes on new table
- add constraints on new table
- add anything else you need on new table (foreign keys, grants...)
and then
DROP TABLE OLD_TABLE;
-- and
RENAME NEW_TABLE To OLD_TABLE;
SELECT with your sample data:
WITH
tbl as
(
Select 1 "NUM", 9234 "STUDENT_ID" From Dual Union All
Select 2 "NUM", 9234 "STUDENT_ID" From Dual Union All
Select 3 "NUM", 9234 "STUDENT_ID" From Dual Union All
Select 4 "NUM", 3456 "STUDENT_ID" From Dual Union All
Select 5 "NUM", 3456 "STUDENT_ID" From Dual Union All
Select 6 "NUM", 789 "STUDENT_ID" From Dual Union All
Select 7 "NUM", 956 "STUDENT_ID" From Dual
)
Select
NUM,
CASE WHEN Count(NUM) Over(Partition By STUDENT_ID) = 1 THEN 'SS' || STUDENT_ID
ELSE 'SS' || Replace(Sum(1) Over(Partition By STUDENT_ID Order By NUM) - 1, 0, '') || STUDENT_ID
END "STUDENT_ID"
From
tbl
Order By NUM
Result:
NUM | STUDENT_ID |
---|---|
1 | SS9234 |
2 | SS19234 |
3 | SS29234 |
4 | SS3456 |
5 | SS13456 |
6 | SS789 |
7 | SS956 |