I store matrix data in sqlite with a schema of the form:
create table mat_long(a varchar, b varchar, float val)
I then can store the following matrix:
x y z
x 4 1 2
y 1 4 3
z 2 3 4
using my table, in the following long-format way:
a | b | val |
---|---|---|
x | x | 4 |
x | y | 1 |
x | z | 2 |
y | x | 1 |
y | y | 4 |
y | z | 3 |
z | x | 2 |
z | y | 3 |
z | z | 4 |
As you noticed, this matrix is symmetric so I would need to store about only half of it. The problem is I already have this data in a table and must delete the symmetric part. A query to find all the symmetric duplicates would be:
select *
from mat_long
where (a, b) in (select b, a
from mat_long) and
a <> b
I would like to delete half of them, the correct ones but not sure how to do that in sqlite or sql for that matter.
I would like to run a delete such that my table above ends like this:
a | b | val |
---|---|---|
x | x | 4 |
x | y | 1 |
x | z | 2 |
y | y | 4 |
y | z | 3 |
z | z | 4 |
I tried the following but it hangs sqlite:
select t1.*
from mat_long as t1
where (t1.a, t1.b) in (select t2.b, t2.a
from mat_long as t2
where t1.a <> t2.a and
t1.b <> t2.b) and
t1.a <> t1.b
CodePudding user response:
Since you know that your table is symmetric, you can filter out values whose "a" is less or equal than the "b" value:
SELECT *
FROM tab
WHERE a <= b;
Check the demo here.
If instead you want to use a DELETE
statement, you can do it as follows:
DELETE FROM tab
WHERE a > b;
Check the demo here.