Home > Blockchain >  Custom query to fetch all entries of a table and that only contains first of many duplicates based o
Custom query to fetch all entries of a table and that only contains first of many duplicates based o

Time:11-30

I have a Location model and the table looks like

   id |      name      | vin |     ip_address    |         created_at         |         updated_at    ---- ---------------- ------ --------------- ---------------------------- ------------------------
    0 | default |    0 | 0.0.0.0/0          | 2021-11-08 11:54:26.822623 | 2021-11-08 11:54:26.822623
    1 | admin  |    1 | 10.108.150.143    | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
    2 | V122  |    122| 10.108.150.122    | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885  
    3 | V123 |    123| 10.108.150.123   | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
    4 | V124  |    124| 10.108.150.124   | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
    5 | V122  |    122| 10.108.150.122    | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
    6 | V125  |    122| 10.108.150.125   | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885

My method in the Location model

   def self.find_all_non_duplicate
     return self.find(:all, :conditions => "id <> 1")
   end

I want to fetch all entries of the locations table except the entry with id = 1 and that contains only the first entry of many duplicates based on the column ip_address.

Since ip_address of id = 2 and id = 5 is duplicate. I want to keep the first entry of many duplicates i.e., id = 2.

The expected result is

 id |      name      | vin |     ip_address    |         created_at         |         updated_at           ---- ---------------- ------ --------------- ---------------------------- ------------------------
 0 | default |    0 | 0.0.0.0/0          | 2021-11-08 11:54:26.822623 | 2021-11-08 11:54:26.822623
 2 | V122  |    122| 10.108.150.122    | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
 3 | V123 |    123| 10.108.150.123   | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
 4 | V124  |    124| 10.108.150.124   | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
 6 | V125  |    122| 10.108.150.125   | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885

The entries with id's 1 and 5 to be ignored

CodePudding user response:

Presumably, this should work:

select("DISTINCT(name), *").where.not(id: 1)

CodePudding user response:

What you need is a distinct on added to RoR quite recently here. In a raw SQL from it will look like this:

select distinct on (ip_address) * 
from test 
where id<>1 
order by ip_address,created_at;

Which, if I navigate your syntax right, would translate to RoR's

   def self.find_all_non_duplicate
     return self.find(:all, :conditions => "id <> 1").distinct-on(:ip_address)
   end

or, in the absence of that recent change

   def self.find_all_non_duplicate
     return self.find(:all, :conditions => "id <> 1").select("distinct on (ip_address) *")
   end

Full db-side test:

drop table if exists test cascade;
create table test (
    id serial primary key,
    name text,
    vin integer,
    ip_address inet,
    created_at timestamp,
    updated_at timestamp);
insert into test 
(id,name,vin,ip_address,created_at,updated_at)
values
(0,'default', 0,'0.0.0.0/0'::inet,'2021-11-08 11:54:26.822623'::timestamp,'2021-11-08 11:54:26.822623'::timestamp),
(1,'admin',   1,'10.108.150.143'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(2,'V122',    122,'10.108.150.122'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(3,'V123',    123,'10.108.150.123'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(4,'V124',    124,'10.108.150.124'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(5,'V122',    122,'10.108.150.122'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp),
(6,'V125',    122,'10.108.150.125'::inet,'2021-11-08 11:54:26.82885'::timestamp,'2021-11-08 11:54:26.82885'::timestamp);

select distinct on (ip_address) * 
from test where id<>1 
order by ip_address,created_at;
-- id |  name   | vin |   ip_address   |         created_at         |         updated_at
------ --------- ----- ---------------- ---------------------------- ----------------------------
--  0 | default |   0 | 0.0.0.0/0      | 2021-11-08 11:54:26.822623 | 2021-11-08 11:54:26.822623
--  2 | V122    | 122 | 10.108.150.122 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
--  3 | V123    | 123 | 10.108.150.123 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
--  4 | V124    | 124 | 10.108.150.124 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
--  6 | V125    | 122 | 10.108.150.125 | 2021-11-08 11:54:26.82885  | 2021-11-08 11:54:26.82885
--(5 rows)

  • Related