I'm trying to put together a query (select preferably) in SQL server that works with a single table. Said table is derived from two sets of data. Records where SET = OLD represent old data, records where SET = NEW represent new data. My intention is as follows:
- If record CODE = A, keep/include the record.
- If record CODE = C, keep/include the record but delete/exclude the corresponding record from the old set under the same ACT value.
- If record CODE = D, delete/exclude it along with its corresponding record from the old set under the same ACT value.
- If CODE = '' (blank/null), keep the record but only if it exists in the OLD set (meaning their isn't a corresponding record from the new set with the same ACT value)
What the table looks like before logic is applied:
ACT|STATUS |CODE|SET|VALUE
222| | |OLD|1
333| | |OLD|2
444| | |OLD|3
111|ADDED |A |NEW|4
222|CHANGED|C |NEW|5
333|DELETED|D |NEW|6
What the table should look like after logic is applied (end result)
ACT|STATUS |CODE|SET|VALUE
444| | |OLD|3
111|ADDED |A |NEW|4
222|CHANGED|C |NEW|5
While I can probably put together a select query to achieve the end result above I doubt it will run efficiently as the table in question has millions of records. What is the best way to do this without taking a long time to obtain the end result?
CodePudding user response:
Something like this. you will have to split your query and union.
--Old Dataset
SELECT O.*
FROM MyTable O
LEFT JOIN Mytable N ON O.ACT = N.ACT AND N.[SET] = 'NEW'
WHERE O.[SET] ='OLD'
AND ISNULL(N.CODE,'A') = 'A'
UNION
-- New records
SELECT N.*
FROM MyTable N
WHERE N.[SET] ='NEW'
AND CODE <> 'D'