I have a table with the following structure :
create table test_18Nov ( account_id nvarchar(12)
, account_name nvarchar(25)
, zip_legacy_file nvarchar(5)
, Region_legacy_file nvarchar(30)
, zip_new_source nvarchar(5)
, Region_new_source nvarchar(30)
)
INSERT INTO test_18Nov VALUES ('S1018', 'John Smith', '32221', 'R087-Jacksonville', '33803', 'R026-Lakeland')
INSERT INTO test_18Nov VALUES ('S1018', 'John Smith', '33606', 'R011-Tampa', '32220', 'R087-Jacksonville')
INSERT INTO test_18Nov VALUES ('S1018', 'John Smith', '33803', 'R026-Lakeland', '33606', 'R011-Tampa')
INSERT INTO test_18Nov VALUES ('AC054', 'David Thompson', '33606', 'R011-Tampa', '32205', 'R087-Jacksonville')
INSERT INTO test_18Nov VALUES ('AC054', 'David Thompson', '33870', 'R058-Sebring', '33606', 'R011-Tampa')
INSERT INTO test_18Nov VALUES ('AC054', 'David Thompson', '33610', 'R011-Tampa', '33870', 'R058-Sebring')
INSERT INTO test_18Nov VALUES ('AC077', 'Stacey Leigh', '34950', 'R043-Fort Pierce', '34982', 'R043-Fort Pierce')
INSERT INTO test_18Nov VALUES ('AC077', 'Stacey Leigh', '33610', 'R011-Tampa', '34950', 'R043-Fort Pierce')
I have to generate a pseudo column for all rows with either Yes or No. For an account id if the legacy region(or regions) is present in new source region(or regions) then the pseudo column will display 'No'. If for an account a new source region (or regions) is not present in legacy region (or regions) then the account will be considered as a move and the pseudo column will display 'Yes'. In the above data set AC054 is the only account id that should have the pseudo column value as 'Yes' because R087-Jacksonville (a new source region) is not present in the legacy region list.
The expected output should be :
account_id | account_name | Region_legacy_file | Region_new_source | Will the account move? |
-------------------------------------------------------------------------------------------------------------
S1018 | John Smith | R087-Jacksonville | R026-Lakeland | No |
-------------------------------------------------------------------------------------------------------------
S1018 | John Smith | R011-Tampa | R087-Jacksonville | No |
-------------------------------------------------------------------------------------------------------------
S1018 | John Smith | R026-Lakeland | R011-Tampa | No |
-------------------------------------------------------------------------------------------------------------
AC054 | David Thompson | R011-Tampa | R087-Jacksonville | Yes |
-------------------------------------------------------------------------------------------------------------
AC054 | David Thompson | R058-Sebring | R011-Tampa | Yes |
-------------------------------------------------------------------------------------------------------------
AC054 | David Thompson | R011-Tampa | R058-Sebring | Yes |
-------------------------------------------------------------------------------------------------------------
AC077 | Stacey Leigh | R043-Fort Pierce | R043-Fort Pierce | No |
-------------------------------------------------------------------------------------------------------------
AC077 | Stacey Leigh | R011-Tampa | R043-Fort Pierce | No |
-------------------------------------------------------------------------------------------------------------
I thought of using NOT EXISTS clause but that will return only the rows where new source region is not found in the list of legacy regions for an account id - which is not of any help to me in this situation. Only way I can think of is using CASE WHEN EXISTS but have not been able to get it to work. If there is any other way please do share.
EDIT : Why AC054 has the pseudo column as 'Yes'? - There are 3 new source regions for account id AC054 and they are R087, R011 and R058. Out of these 3 regions 2 regions are found in legacy regions for account id AC054 and they are R011 and R058. So R087 is not a part of legacy region list for account id AC054 and hence the pseudo column has 'Yes'. For the other 2 account ids that is not the case because all the new source regions are found in the legacy region list.
CodePudding user response:
I'm checking your information and I did a sample what a sub query CheckLegacy will return the account what should be moved, follows the example:
WITH test_18Nov AS (
SELECT * FROM (
VALUES
('S1018', 'John Smith', '32221', 'R087-Jacksonville', '33803', 'R026-Lakeland'),
('S1018', 'John Smith', '33606', 'R011-Tampa', '32220', 'R087-Jacksonville'),
('S1018', 'John Smith', '33803', 'R026-Lakeland', '33606', 'R011-Tampa'),
('AC054', 'David Thompson', '33606', 'R011-Tampa', '32205', 'R087-Jacksonville'),
('AC054', 'David Thompson', '33870', 'R058-Sebring', '33606', 'R011-Tampa'),
('AC054', 'David Thompson', '33610', 'R011-Tampa', '33870', 'R058-Sebring'),
('AC077', 'Stacey Leigh', '34950', 'R043-Fort Pierce', '34982', 'R043-Fort Pierce'),
('AC077', 'Stacey Leigh', '33610', 'R011-Tampa', '34950', 'R043-Fort Pierce')
) AS _ (account_id,account_name, zip_legacy_file,Region_legacy_file,zip_new_source,Region_new_source)
),
--formatting the query for the field I need
IdAndNewLegacy as (
SELECT account_id, Region_new_source FROM test_18Nov
),
--check if some new legacy region is not in Region_legacy_file
CheckLegacy as (
SELECT I.account_id, T.account_id as id FROM IdAndNewLegacy as I
LEFT JOIN test_18Nov as T ON I.account_id = T.account_id and I.Region_new_source = T.Region_legacy_file
WHERE T.account_id is null
GROUP BY I.account_id, T.account_id
)
--Query to present the data
SELECT
t.*,
CASE WHEN c.account_id is not null then 'Yes' ELSE 'No' END as [Will the account move?]
FROM
test_18Nov as t
LEFT JOIN CheckLegacy as c ON T.account_id = C.account_id
The result:
Best Regards
CodePudding user response:
Use a CASE expression (Or IIF()
) to get YES/NO (using a NOT EXISTS check) for each Row.
Then use MAX(expression) OVER (PARTITION BY account_id)
to give all account rows YES is ANY of them are yes.
SELECT
t.*,
MAX(
IIF(
NOT EXISTS (
SELECT *
FROM Test_18Nov
WHERE account_id = t.account_id
AND region_legacy_file = t.region_new_source
),
'YES',
'NO'
)
) OVER (PARTITION BY t.account_id)
FROM
test_18Nov AS t
ORDER BY
t.account_id,
t.region_new_source
DEMO: https://dbfiddle.uk/rOS1OoFe
CodePudding user response:
You could use a SQL subquery, which means that for each row you execute another query that looks up a result, for example:
SELECT name, id, location, (SELECT TOP 1 'exists' FROM location l WHERE l.location = c.location)
FROM customer c