I am trying to find only one occurrence for each customer.
However, in my database I have customers that have been added twice (following an ERP migration)
Currently,
If I try to find a customer that has two occurrences, I have to keep the customer that has a 'C' in the "customer_id" column
In this example we have "Manu Johns" who appears 2x so we must keep the one who has a 'C' in the customer_id column in the final table.
If I only find one occurrence of this customer. But, which does not have a 'C' in the customer_id column. We have to add it as is in the final table
In this example we have "Mathieu Wainers" which appears only once we keep it as it is in the final table
Which query would allow me to have this result : https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=9484f43c0a6c1ccdae7d659ca53e1eab
CREATE TABLE PersonsInitial (
tel int,
firstname varchar(255),
lastname varchar(255),
Customer_ID varchar(255)
);
insert into PersonsInitial(tel,firstname,lastname,Customer_ID) values
('01234','Manu','Johns','456'),
('01234','Manu','Johns','C456'),
('21234','Fernand','Wajk','C389'),
('13554','Mathieu','Wainers','4683');
select distinct tel, firstname, lastname, customer_id from PersonsInitial
--if there is a person with the same tel number chose the customer id with 'C'
--if I don't have the choice add the customer without C
CREATE TABLE PersonsFinal (
tel int,
firstname varchar(255),
lastname varchar(255),
Customer_ID varchar(255)
);
insert into PersonsFinal(tel,firstname,lastname,Customer_ID) values
('01234','Manu','Johns','C456'),
('21234','Fernand','Wajk','C389'),
('13554','Mathieu','Wainers','4683');
select distinct tel, firstname, lastname, customer_id from PersonsFinal
CodePudding user response:
You may rank them first based on whether it has or not "C" in the customer id. That's why cte is here for.
with cte as (select row_number() over (partition by tel, firstname, lastname order by case when left(customer_id, 1) = 'C' then 0 else 1 end) rn,
p.*
from PersonsInitial p)
select *
from cte
where rn = 1; <-- selects only those with "C" or those for that there is no "C" lines
CodePudding user response:
There are multiple solutions to this problem. You can, for example use OUTER APPLY. ie:
insert into PersonsFinal(tel,firstname,lastname,Customer_ID)
select distinct pi1.tel, pi1.firstname, pi1.lastname, coalesce(pi.Customer_ID, pi1.Customer_ID) Customer_Id
from PersonsInitial pi1
outer apply (select top(1) *
from PersonsInitial pi2
where pi1.tel = pi2.tel
and pi1.firstname = pi2.firstname
and pi1.lastname = pi2.lastname
and pi2.Customer_ID like 'C%') pi;
CodePudding user response:
Another solution :
WITH CTE AS
(
SELECT tel,
firstname,
lastname,
Customer_ID,
ROW_NUMBER() OVER (PARTITION BY Customer_ID ORDER BY COL_LENGTH('PersonsInitial','Customer_ID') DESC) AS RowNumber
FROM PersonsInitial
)
SELECT tel,
firstname,
lastname,
Customer_ID
FROM CTE
WHERE RowNumber = 1