Home > Back-end >  Select with group by time nearest records
Select with group by time nearest records

Time:04-06

I have a SQL table including records where same code can be inserted multiple times as shown below. The challenge is that I need extract only a single record (can be the first, last or either whatever in the middle) if the same code appear more than once in a 60 seconds period. For example

|  ID  |          DATE           |  CODE   | 
 ------ ------------------------- --------- 
| 1715 | 2022-04-04 19:30:59.593 | KXI235  |    /* keep this */
| 1716 | 2022-04-04 19:30:59.710 | CLH827  |    /* keep this */
| 1717 | 2022-04-04 19:31:00.490 | CLH827  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1718 | 2022-04-04 19:31:00.973 | AA295WG |    /* keep this */
| 1719 | 2022-04-04 19:31:01.207 | CLH827  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1720 | 2022-04-04 19:31:01.347 | LIC303  |    /* keep this */
| 1721 | 2022-04-04 19:31:01.470 | AC435AD |    /* keep this */
| 1722 | 2022-04-04 19:31:01.853 | AC435AD |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1723 | 2022-04-04 19:31:02.363 | AA295WG |    /* keep this */
| 1724 | 2022-04-04 19:31:02.847 | AC435AD |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1725 | 2022-04-04 19:31:04.237 | PHG644  |    /* keep this */
| 1726 | 2022-04-04 19:31:06.467 | BEA9628 |    /* keep this */
| 1727 | 2022-04-04 19:31:08.467 | AC105GI |    /* keep this */
| 1728 | 2022-04-04 19:31:09.447 | AC167LX |    /* keep this */
| 1729 | 2022-04-04 19:31:09.380 | A127KOA |    /* keep this */
| 1730 | 2022-04-04 19:31:09.843 | AC167LX |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1731 | 2022-04-04 19:31:11.200 | NDF020  |    /* keep this */
| 1732 | 2022-04-04 19:31:21.440 | NDF020  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1733 | 2022-04-04 19:31:31.947 | NDF020  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1734 | 2022-04-04 19:31:42.073 | NDF020  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1735 | 2022-04-04 19:31:53.207 | NDF020  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1736 | 2022-04-04 19:32:02.947 | NDF020  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1737 | 2022-04-04 19:32:04.233 | NRE781  |    /* keep this */
| 1738 | 2022-04-04 19:32:06.843 | NDF020  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1739 | 2022-04-04 19:32:09.077 | NDF020  |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1740 | 2022-04-04 19:32:10.347 | AE362EO |    /* keep this */
| 1741 | 2022-04-04 19:32:10.097 | AC435AD |    /* Keep this because same CODE exist on ID 1721 record but more than 60 seconds difference between DATE value */
| 1742 | 2022-04-04 19:32:10.940 | AE362EO |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1743 | 2022-04-04 19:32:11.580 | AE362EO |    /* Discard because same CODE exists with DATE interval  less than 60 seconds */
| 1744 | 2022-04-04 19:32:12.443 | OHY127  |    /* keep this */

So after he query the expected result would be

|  ID  |          DATE           |  CODE   | 
 ------ ------------------------- --------- 
| 1715 | 2022-04-04 19:30:59.593 | KXI235  | 
| 1716 | 2022-04-04 19:30:59.710 | CLH827  | 
| 1718 | 2022-04-04 19:31:00.973 | AA295WG |    
| 1720 | 2022-04-04 19:31:01.347 | LIC303  |    
| 1721 | 2022-04-04 19:31:01.470 | AC435AD |    
| 1723 | 2022-04-04 19:31:02.363 | AA295WG |    
| 1725 | 2022-04-04 19:31:04.237 | PHG644  |    
| 1726 | 2022-04-04 19:31:06.467 | BEA9628 |    
| 1727 | 2022-04-04 19:31:08.467 | AC105GI |    
| 1728 | 2022-04-04 19:31:09.447 | AC167LX |    
| 1729 | 2022-04-04 19:31:09.380 | A127KOA |    
| 1731 | 2022-04-04 19:31:11.200 | NDF020  | 
| 1737 | 2022-04-04 19:32:04.233 | NRE781  | 
| 1740 | 2022-04-04 19:32:10.347 | AE362EO |    
| 1741 | 2022-04-04 19:32:10.097 | AC435AD |    
| 1744 | 2022-04-04 19:32:12.443 | OHY127  |  

Note that both on ID 1741 and ID 1721 same CODE is listed by time difference between ID 1741 and the last occurence of the same CODE (ID1724) is more than 60 seconds so its is correct that this record is in the select output.

Can anyone help me with the right query?

Thanks!!!!

CodePudding user response:

This looks to be a use-case for not exists. I think AA295WG in your expected results is wrong (only 2 seconds apart) but does the following work for you?

select * 
from t
where not exists (
  select * from t t2
  where t2.code = t.code
   and t2.id < t.id 
   and DateDiff(second, t2.date, t.date) <= 60
);

Demo Fiddle

  • Related