Home > OS >  How to avoid duplicates when finding row changes?
How to avoid duplicates when finding row changes?

Time:03-26

I am working to extract data when a column changes between user IDs in a single table. I am able to pull the change as well as the previous row (ID) using a Select Union query. For the previous row, I am getting more than one due to the parameters. Looking for suggestions on how to only retrieve a single previous row (ID). The query below is trying to retrieve a single row

| ID | Year | Event | ActivityDate | UserID
| 1  | 2020 | A     | 2020-12-01   | xxx
| 1  | 2021 | A     | 2021-03-01   | xyz
| 2  | 2020 | B     | 2021-01-01   | xxx
| 1  | 2022 | C     | 2021-10-01   | yyy
| 3  | 2021 | D     | 2021-12-01   | xxx

Select d.ID, d.Year, d.Event, d.ActivityDate, d.UserID
from tableA d
where
    d.year in ('2020','2021','2022')
    and d.event <>
    (select f.event
    from tableA f
    where
        f.year in ('2020','2021','2022')
        and d.id = f.id
        and d.activityDate < f.activityDate
    order by f.activityDate desc
    fetch first 1 row only
    )
;

I was hoping to retrieve the following

1, 2021, A, 2021-03-01, xyz

But I got

1, 2020, a, 2020-12-01, xxx
1, 2021, a, 2021-03-01, xyz

CodePudding user response:

I think analytic functions will help you to your answer.

The row_number() will get you the last row in a series of duplicates.

The count(id) will allow you to limit yourself to combinations that have more than one row.

 WITH
     aset
     AS
         (SELECT d.id
               , d.year
               , d.event
               , d.activitydate
               , d.userid
               , ROW_NUMBER ()
                     OVER (PARTITION BY id, event ORDER BY year DESC)    AS rn
               , COUNT (id) OVER (PARTITION BY id, event)                AS n
            FROM tablea d)
 SELECT *
   FROM aset
  WHERE rn = 1 AND n > 1;
  • Related