I am creating a face recognition system, but the search is very slow. Can you share how to speed up the search?
It takes about 6 seconds for 100,000 data items.
- MySQL
mysql> SHOW VARIABLES LIKE '%version%';
-------------------------- ------------------------------
| Variable_name | Value |
-------------------------- ------------------------------
| version | 8.0.29 |
| version_comment | MySQL Community Server - GPL |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
-------------------------- ------------------------------
- Table
CREATE TABLE `face_feature` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`f1` decimal(9,8) NOT NULL,
`f2` decimal(9,8) NOT NULL,
...
...
`f127` decimal(9,8) NOT NULL,
`f128` decimal(9,8) NOT NULL,
PRIMARY KEY (id)
);
- Data
mysql> SELECT count(*) FROM face_feature;
----------
| count(*) |
----------
| 110004 |
----------
mysql> SELECT * FROM face_feature LIMIT 1\G;
id: 1
f1: -0.07603023
f2: 0.13605964
...
f127: 0.09608927
f128: 0.00082345
- SQL
SELECT
id,
sqrt(
power(f1 - (-0.09077361), 2)
power(f2 - (0.10373443), 2)
...
...
power(f127 - (0.0778369), 2)
power(f128 - (0.00951046), 2)
) as distance
FROM
face_feature
ORDER BY
distance
LIMIT
1
;
- Result
---- --------------------
| id | distance |
---- --------------------
| 1 | 0.3376853491771237 |
---- --------------------
1 row in set (6.18 sec)
Update 1:
Changed from decimal(9,8)
to float(9,8)
Then, improved from approximately 4sec
to 3.26 sec
mysql> desc face_feature;
------- ------------ ------ ----- --------- ----------------
| Field | Type | Null | Key | Default | Extra |
------- ------------ ------ ----- --------- ----------------
| id | int | NO | PRI | NULL | auto_increment |
| f1 | float(9,8) | NO | | NULL | |
| f2 | float(9,8) | NO | | NULL | |
..
| f127 | float(9,8) | NO | | NULL | |
| f128 | float(9,8) | NO | | NULL | |
------- ------------ ------ ----- --------- ----------------
Update 2:
Changed from POWER(z, 2)
to z*z
Then, the result was changed from 3.26 sec
to 4.65 sec
SELECT
id,
sqrt(
((f1 - (-0.09077361)) * (f1 - (-0.09077361)))
((f2 - (0.10373443)) * (f2 - (0.10373443)))
((f3 - (0.00798536)) * (f3 - (0.00798536)))
...
...
((f126 - (0.07803915)) * (f126 - (0.07803915)))
((f127 - (0.0778369)) * (f127 - (0.0778369)))
((f128 - (0.00951046)) * (f128 - (0.00951046))
) as distance
FROM
face_feature
ORDER BY
distance
LIMIT
1
;
Update 3
I am looking into the usage of MySQL GIS.
How can I migrate from "float" to "points" in MySQL?
Update 4
I'm also looking at PostgreSQL because I can't find a way to handle 128 dimensions in MySQL.
CodePudding user response:
DECIMAL(9,8)
-- that's a lot of significant digits. Do you need that much precision?FLOAT
-- about 7 significant digits; faster arithmetic.POWER(z, 2)
-- probably a lot slower thanz*z
. (This may be the slowest part.)SQRT
-- In many situations, you can simply work with the squares. In this case:SELECT SQRT(closest) FROM ( SELECT -- leave out SQRT ... ORDER BY .. LIMIT 1 )
Here are some other thoughts. They are not necessarily relevant to the query being discussed:
Precise testing -- Beware of comparing for 'equal' Roundoff error is likely to make things unequal unexpectedly. Imprecise measurements add to the issue. If I measure something twice, I might get 1.23456789 one time and 1.23456788 the next time. (Especially at that level of "precision".
Trade complexity vs speed -- Use
ABS(a - b)
as the distance formula; find the 10 items closest in that way, then use the Euclidean distance to get the 'right' distance.Break the face into regions. Find which region the item is in, then check only the subset of the 128 points that are in that region. (Being near a boundary -- put some points in two regions.)
Think out of the box -- I'm not familiar with your facial recognition, so I have run out of mathematical tricks.
Switch to
POINTs
and aSPATIAL
index. It may be possible your task orders of magnitude faster. (This is probably not practical for 128-dimensional space.)