Home > Back-end >  SQL server: finding if a client exists (and the account they own) in MULTIPLE columns of the account
SQL server: finding if a client exists (and the account they own) in MULTIPLE columns of the account

Time:03-08

I'm trying to find the list of customers that own (or co-own) an account and the account number.

Our CLIENT table contains our client data, identified with a clientId (table also contains demographics and contact details).

Our ACCOUNTS table contains account information where a clientId can exist in 1 or 5 colums (one account can have up to 5 co-owners).

My code looks like this, but is agonisingly slow. Is this the correct approach? Are there other approaches that are less expensive?

Select c.*, aa.accountNo from 
client as c, accounts as aa
where
exists
(select 1
from 
accounts as a
where
Cast(a.Account_Date as Date) >= '2010-11-15' and 
Cast(a.Account_Date as Date) <= '2017-04-24'
and c.clientId in (a.Owner1, a.Owner2, a.Owner3, a.Owner4, a.Owner5)
)

CodePudding user response:

The query you have explodes the data set and scans accounts twice. Also we shouldn't have to cast the column side to fit into a date range, there's now way that's beneficial (it might not always cause a scan but it's still not great). Try:

SELECT c.*, a.accountNo
FROM dbo.accounts AS a
CROSS APPLY
(
  VALUES(Owner1),(Owner2),(Owner3),(Owner4),(Owner5)
) AS ac(clientId)
INNER JOIN dbo.client AS c
ON c.clientId = ac.clientId
WHERE a.Account_Date >= '20101115'
  AND a.Account_Date <  '20170425';

CodePudding user response:

Your primary issue is lack of normalization. You should not have five Owner columns. Instead you should have a separate table of AccountOwner. Then you can simply join it.

This is effectively what you get from @AaronBetrand's answer, except that that one cannot be indexed as it is virtual.

Note also:

  • There was no need to access accounts again in the subquery.
  • Always use explicit join syntax, not the implicit , syntax.
  • Never cast a column in order to filter or join it. Always cast the constants instead. In this case you don't even need to cast them.
SELECT
  c.*,
  a.accountNo
FROM Client AS c
JOIN AccountOwner AS ao ON ao.OwnerId = c.ClientId
JOIN Accounts AS a
    ON a.AccountNo = ao.AccountNo
   AND a.Account_Date >= '20101115'
   AND a.Account_Date <  '20170425';

For this query to work efficiently, you will need the following indexes

Account (Account_Date, AccountNo)
AccountOwner (AccountNo, OwnerId)
Client (ClientId) INCLUDE (OtherColumns)

An alternative set of indexes may prove a better access strategy (you need to test)

Account (AccountNo) INCLUDE (Account_Date)
AccountOwner (OwnerId, AccountNo)
Client (ClientId) INCLUDE (OtherColumns)
  • Related