We have some tables in an Oracle 19c database that were put in place many years ago for audit purposes. This 'audit' table is being generated by a trigger on update/delete to another table "widget". It is not checking to see if the new data matches the old data, so there are a lot of rows that are identical in everything but timestamp. I am not looking for comments on fixing the auditing as there many solutions to that mess. We are currently just stopping the bleeding until a better solution can be implemented.
My question is how to cut my existing audit data down to only the records that actually changed. These tables have grown to the point that it is impacting some of our processes.
auditWidget table data:
rowID | widgetID | widgetDesc | timestampOfTableUpdate |
---|---|---|---|
1 | 1 | A | 1/29/2016 12:12:55 PM |
2 | 2 | test | 1/29/2016 12:13:55 PM |
3 | 1 | A | 1/29/2016 12:14:55 PM |
4 | 1 | A | 1/29/2016 12:15:55 PM |
5 | 2 | test | 1/29/2016 12:16:55 PM |
6 | 1 | c | 1/29/2016 12:17:55 PM |
7 | 1 | b | 1/29/2016 12:18:55 PM |
8 | 2 | d | 1/29/2016 12:19:55 PM |
As you can see, nothing has changed between row 1,3 and 4, or between 2 and 5. These rows should never have been created. I am trying to find a way to identify/delete the entries that don't show any change in the widget table. My final data would look like this:
rowID | widgetID | widgetDesc | timestampOfTableUpdate |
---|---|---|---|
1 | 1 | A | 1/29/2016 12:12:55 PM |
2 | 2 | b | 1/29/2016 12:13:55 PM |
6 | 1 | c | 1/29/2016 12:17:55 PM |
7 | 1 | b | 1/29/2016 12:18:55 PM |
8 | 2 | d | 1/29/2016 12:19:55 PM |
LAG only looks at the previous row which might contain a different widgetID. There are thousands of widgetIDs. There is also potentially many identical rows per widgetID in a row. Are there any solutions out there short of looping through the data and conditionally doing something with it row by row? Any direction would be appreciated.
CodePudding user response:
Delete duplicates, then:
delete from auditwidget a
where a.rowid > (select min(b.rowid)
from auditwidget b
where b.widgetid = a.widgetid
and b.widgetdesc = a.widgetdesc
);
CodePudding user response:
The simple solution to only SELECT
the relevant data is to use LAG()
. For example:
select *,
from (
select t.*,
lag(widgetdesc) over(
partition by widgetid
order by timestampOfTableUpdate
) as prev_desc
from my_table t
) x
where widgetdesc <> prev_desc