Home > Software engineering >  Having difficulty generating distinct rows without duplicates in a SELF JOIN (SQL)
Having difficulty generating distinct rows without duplicates in a SELF JOIN (SQL)

Time:03-02

I'm pretty new to SQL and thought I'd be able to use it to create a list for my employer's clients. Unfortunately, there the majority of clients have more than one account and the file has a distinct row for each account.

I was trying to use a self join to create one row for each client with multiple columns for the accounts.

SELECT DISTINCT A.Account_Number AS Account_1, B.Account_Number AS Account_2, A.Client_Name
FROM client_table AS A, client_table AS B
WHERE A.Account_Number <> B.Account_Number
AND A.Client_Name = B.Client_Name
ORDER BY A.Client_Name;

Unfortunately, the results were such that I would get a table that looks like:

Account_1 Account_2 Client_name
000001 000002 Joe Shmo
000001 000003 Joe Shmo
000002 000003 Joe Shmo
000002 000001 Joe Shmo

I understand that for more than two accounts, I'll need more than two joins, but I haven't figured out how to do it.

Is there a way to prevent double entry?

I'm using BigQuery btw.

CodePudding user response:

You can use GROUP_CONCAT() to fetch a comma seperated list:

create table clients (
cname varchar(10),
account char(6));
insert into clients values
('Joe Shmo','000001'),
('Joe Shmo','000002'),
('Joe Shmo','000003'),
('Joe Shmo','000004');
select 
   cname Name, 
   group_concat(account )account_numbers
from clients
group by cname
order by cname;
Name     | account_numbers            
:------- | :--------------------------
Joe Shmo | 000001,000002,000003,000004

db<>fiddle here

Please see the following post for more information on group_concat in bigQuery: BigQuery GROUP_CONCAT and ORDER BY

CodePudding user response:

You can do this with LEFT JOIN. Repeat as necessary for however many accounts a user may have.

SELECT DISTINCT ct.Client_Name, ct1.Account_Number AS Account_1, ct2.Account_Number AS Account_2,
                ct3.Account_Number AS Account_3, ct4.Account_Number AS Account_4
FROM client_table ct
LEFT JOIN client_table ct1 ON ct1.Client_Name = ct.Client_Name AND ct1.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name)
LEFT JOIN client_table ct2 ON ct2.Client_Name = ct.Client_Name AND ct2.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct1.Account_Number)
LEFT JOIN client_table ct3 ON ct3.Client_Name = ct.Client_Name AND ct3.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct2.Account_Number)
LEFT JOIN client_table ct4 ON ct4.Client_Name = ct.Client_Name AND ct4.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct3.Account_Number)
  • Related