I have a table Visited
with 2 columns:
ID | City
ID
is an integer, City
is a string.
Note that none of the columns is a key by itself - we can have the same ID visiting several cities, and several different IDs in the same city.
Given a specific ID, I want to return all the IDs in the table that visited at least half of the places that the input ID did (not including themselves)
edit: We only count places that are the same. so if ID 1 visited cities a,b,c. ID 2 visited b,c,d. ID 3 visited c,d,e.
then for ID=1 we return only [2], because out of the three cities ID1 visited, ID3 visited only one
CodePudding user response:
Inner join the visited table with the list of cities visited by the specific id, then select ids with at least half of the number of rows when grouped by id.
with u as
(select city as visitedBySpecificId from visited where id = *specificId*),
v as
(select * from visited inner join u on city = visitedBySpecificId where id <> *specificId*)
(select id from v group by id having count(*) >= (select count(*) from u)/2.0)
CodePudding user response:
Join them and compare the counts.
create table suspect_tracking (id int, city varchar(30)) insert into suspect_tracking values (1, 'Brussels'), (1,'London'), (1,'Paris') , (1,'New York'), (1,'Bangkok'), (1, 'Hong Kong') , (1,'Dubai'), (1,'Singapoor'), (1,'Rome') , (1,'Macau'), (1, 'Istanbul'), (1,'Kuala Lumpur') , (1,'Dehli'), (1,'Tokyo'), (1,'Moscow') , (2,'New York'), (2,'Bangkok'), (2, 'Hong Kong') , (2,'Dubai'), (2,'Singapoor'), (2,'Rome') , (2,'Macau'), (2, 'Istanbul'), (2,'Kuala Lumpur') , (3,'Macau'), (3, 'Istanbul'), (3,'Kuala Lumpur') , (3,'Dehli'), (3,'Tokyo'), (3,'Moscow')
with cte_suspects as ( select id, city from suspect_tracking group by id, city ) , cte_prime_suspect as ( select distinct id, city from suspect_tracking where id = 1 ) , cte_prime_total as ( select id, count(city) as cities from cte_prime_suspect group by id ) select sus.id from cte_prime_suspect prime join cte_prime_total primetot on primetot.id = prime.id join cte_suspects sus on sus.city = prime.city and sus.id <> prime.id group by prime.id, sus.id, primetot.cities having count(sus.city) >= primetot.cities/2
| id | | -: | | 2 |
db<>fiddle here