Home > Back-end >  Cleaning up bad Delete/Inserts in Teradata Temporal Tables
Cleaning up bad Delete/Inserts in Teradata Temporal Tables

Time:10-02

Looking to find a way to clean up some data in a temporal table where bad Deletes/Inserts happened throughout the years. I was able to use NORMALIZE when the period values overlapped/met, however, I am now running into an issue where the periods do not overlap/meet. I would like to consolidate the periods.

Before: Before

After: After

My original plan was to take the min(begin(eff_vt_dttm)) and max(end(eff_vt_dttm)) and create a new period after grouping by all columns. Problem is the edge case where the row values repeat, for example rows 1 and 2 should consolidate, but don't want rows 5 and 6 to consolidate with rows 1 and 2. Rows 1 and 2 should consolidate into their own row, rows 5 and 6 should consolidate into their own row as well. I was thinking of creating groupings so that I can take the min/max of the group, but had a hard time.

Grouping Idea: Grouping Idea

Wondering if anyone knows a good approach for this problem.

Thanks!

CodePudding user response:

This is untested, for the reasons noted in my comment on your post.

It uses a concept called gaps-and-islands to identify consecutive records where the values remain unchanged. These form you 'groups' in the final image.

WITH
  sorted AS
(
  SELECT
    your_data.*,
    ROW_NUMBER() OVER (PARTITION BY id
                           ORDER BY BEGIN(eff_vt_dttm)
                      )
                        AS seq_num_id,
    ROW_NUMBER() OVER (PARTITION BY id, col1, col2
                           ORDER BY BEGIN(eff_vt_dttm)
                      )
                        AS seq_num_id_col1_col2
  FROM
    your_data
)
SELECT
  s.id,
  s.col1,
  s.col2,
  PERIOD(
    MIN(begin(s.eff_vt_dttm)),
    MAX(end(s.eff_vt_dttm))
  )
    AS eff_vt_dttm
FROM
  sorted  AS s
GROUP BY
  s.id,
  s.col1,
  s.col2,
  s.seq_num_id - s.seq_num_id_col1_col2
ORDER BY
  s.id,
  MIN(begin(s.eff_vt_dttm))
  • Related