I´m currently working stuck on a SQL issue (well, mainly because I can´t find a way to google it and my SQL skills do not suffice to solve it myself)
I´m working on a system where documents are edited. If the editing process is finished, users mark the document as solved. In the MSSQL database, the corresponding row is not updated but instead, a new row is generated. Thus, every document that has been processed has [e.g.: should have] multiple rows in the DB.
See the following situation:
ID | ID2 | Solved | Steps | Process | Solved |
---|---|---|---|---|---|
1 | 1 | yes | Three | ATAT | AF |
2 | 2 | yes | One | ATAT | FR |
2 | 3 | yes | One | ATAT | EG |
2 | 4 | yes | One | ATAT | AF |
3 | 5 | no | One | ABAT | AF |
4 | 6 | yes | One | ATAT | FR |
5 | 7 | no | One | AVAT | EG |
6 | 8 | yes | Two | SATT | FR |
6 | 9 | yes | Two | SATT | EG |
6 | 10 | yes | Two | SATT | AF |
Now: I need to find out which documents have not been processed yet. In order to achieve this, I have to remove the rows with the IDs 2 | 4 (because the system sadly isn´t too reliable I guess) | and 6.
What I could do is filter the duplicates afterwards, in python/js/whatever. But I am curious whether I can remove these rows directly in SQL as well.
CodePudding user response:
I'm not sure I understand your question correct:
...every document that has been processed has [...] multiple rows in the DB
I need to find out which documents have not been processed yet
So it seems you need to find unique documents with no versions, this could be done using a GROUP BY
with a HAVING
clause:
SELECT
Id
FROM dbo.TableName
GROUP BY Id
HAVING COUNT(*) = 1
CodePudding user response:
If you need to delete all rows where every id doesn't have any "Solved = 'no'", you can use a DELETE
statement that will exclude all "id" values that have at least one "Solved = 'no'" in the corresponding rows.
DELETE FROM tab
WHERE id NOT IN (SELECT id FROM tab WHERE Solved1 = 'no');
Check the demo here.