I'm looking for a faster way to calculate Euclidean distances in SQL.
- Problem I want to solve
The following "Euclidean distance calculation" is slow.
SELECT
id,
sqrt(
power(f1 - (-0.09077361), 2)
power(f2 - (0.10373443), 2)
...
...
power(f127 - (0.0778369), 2)
power(f128 - (0.00951046), 2)
) as distance
FROM
face_feature
ORDER BY
distance
LIMIT
1
;
- What I want to know
Can you share how to migrate from "float" to "points"?
I received the following advice, but I don't understand how.
Switch to POINTs and a SPATIAL index. It may be possible your task orders of magnitude faster.
- MySQL
mysql> SHOW VARIABLES LIKE '%version%';
-------------------------- ------------------------------
| Variable_name | Value |
-------------------------- ------------------------------
| version | 8.0.29 |
| version_comment | MySQL Community Server - GPL |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
-------------------------- ------------------------------
- Table
mysql> desc face_feature;
------- ------------ ------ ----- --------- ----------------
| Field | Type | Null | Key | Default | Extra |
------- ------------ ------ ----- --------- ----------------
| id | int | NO | PRI | NULL | auto_increment |
| f1 | float(9,8) | NO | | NULL | |
| f2 | float(9,8) | NO | | NULL | |
..
| f127 | float(9,8) | NO | | NULL | |
| f128 | float(9,8) | NO | | NULL | |
------- ------------ ------ ----- --------- ----------------
- Data
mysql> SELECT count(*) FROM face_feature;
----------
| count(*) |
----------
| 100003 |
----------
mysql> SELECT * FROM face_feature LIMIT 1\G;
id: 1
f1: -0.07603023
f2: 0.13605964
...
f127: 0.09608927
f128: 0.00082345
- Reference (My other question)
How can I make "euclidean distance calculation" faster in MySQL?
CodePudding user response:
Don't use
FLOAT(M,N)
it adds an extra rounding that only hurts various operations.FLOAT(9,8)
, if the numbers are near "1.0" will lose some precision. This is because there are only 24 bits of precision in anyFLOAT
.(m,n)
onFLOAT
andDOUBLE
has been deprecated (as useless and misleading) in newer versions of MySQL.There are helper functions to convert numeric strings to
POINT
values. Internally, aPOINT
contains twoDOUBLEs
. Hence the originalDECIMAL(9,8)
loses only a round-from-decimal-to-binary at the 53rd significant bit.
But the real question is about using SPATIAL
indexing when the universe has 128 dimensions. I don't think it will work. (I have not even heard of using SPATIAL
for 3 dimensions, though it should be practical.)