MySQL: 8.0.31 - Large gaps (thousands) in ID Field after few months of usage-CodePudding

We see large gaps in primary key column with auto_increment in mysql 8.0.31 release version in multiple tables (with default - innodb_autoinc_lock_mode = 2 (“interleaved” lock mode)). Also I understand some smaller gaps can happen in this mode.

We have 'order' table (id bigint) and 'order_line_items' table (id bigint).

TABLE_NAME          | CURRENT AUTO_INCREMENT    |  TABLE ROW COUNT
order               |   27504                   |        1367
order_line_items    |  430930                   |       34970

We have a total of 1367 order rows with current auto increment value of '27504', and 34970 (order line item rows) with current auto increment value of '430930'. We save, order and order_line_items all within the same transaction in a API call (there is possibility for error which can lead to transaction rollback).

Gaps:

Order table gaps continued to happen for 1-2 day once we noticed and went back to incrementing by '1' again as of today. Jumps are happening together in clusters continuously (check below) and went back to incrementing by '1'.

Order table gap details:

id    |  difference from previous 'id'
-------------------------------------
27489   1
27488   1
27487   1
27486   232
27254   232
27022   693
26329   240
26089   401
25688   175
25513   348
25165   864
24301   7656
16645   1709
14936   184
14752   13
14739   541
14198   82
14116   215
13901   68
13833   117
13716   140
....
1468    1
1467    10
1457    1
1456    1
1455    1
1454    9
1445    1
1444    1
1443    1
1442    1
1441    1

Observations:

We use (OpenJPA) use the configuration (@Id, @GeneratedValue(strategy = GenerationType.IDENTITY) for the ID. Id's are not set in the code.
We see the same pattern in other tables as well. We tested the same API with parallel creation of the same with 20 threads but we didn't similar large gaps in the create id's for both tables.
Till id: 1445 (we have count of 1225 rows) in order table, we don't see any issue with bigger jumps. After id: 1445, we started seeing multiple jumps in order table till 27487 for 1-2 days continuously.
Also I understand the rollback of transaction's in API call can increase the auto-increment value, but the multiple bigger jumps compared to total records seems not understandable.

Questions:

Does anyone has any idea what can be the cause for this ?. Why id's are jumping in multiple thousands (e.g., 7656, 1709 in above example) when the total records are below 1367 in order tables ?.
I see this happens in multiple cluster of islands of the ids jumping and starts back to create one by one (like the above example). Has this relate to any other internal handling of mysql data relate processes. ?. Is this a possible bug in mysql ?
What can cause 'mysql' to jump higher like this (for e.g., bulk inserts, can shared locks cause this) ?. How can I simulate this issue locally ?.

Any ideas would be really helpful.

CodePudding user response：

Id "gaps" can occur for several reasons:

If an INSERT fails, for example by conflicting with another constraint like UNIQUE or FOREIGN KEY, the generation of an id will not be undone. The row will not be inserted, but the id is still incremented, so a value is "lost."

I helped a client with such an issue once. They had a UNIQUE constraint in the username in the users table, and some people kept trying to sign up as new users on the website using existing usernames. This happened every day, and kept increasing, so eventually there was about 1500 id's "lost" per day.
If the INSERT succeeds, but the transaction is rolled back by your application for some reason.
If the auto_increment_increment is set to something other than 1. This is an option that can be set temporarily by any session, so you may not see it in your MySQL Server config file. But it would probably be in the application code.
The INSERT specifies a value higher than the next greater id value, overriding the auto-increment. Auto-increment will not generate a value lower than the greatest value in the table, so if an INSERT specifies a higher value, the next INSERT after that will be greater. It won't fill in the gap it left.
Someone ran ALTER TABLE ... AUTO_INCREMENT=... and changed the table's current next id value.
InnoDB bugs. These are rare, but they have been reported that sometimes a table "skips" auto-increment values instead of incrementing strictly by 1. The auto-increment guarantees unique values, but it does not guarantee consecutive values.

CodePudding user response：

Well, for starters, I wouldn't start checking the inserts. They are hardly at fault here. How is the insert done? I would expect that if the API call fails, you'll rollback, else commit, correct? But keep in mind that if a transaction fails, the sequence is not rolled back.

I would search the API call logs for errors. Chances are, if you have max id 1099, then the next insert is 1104, I'd say you'll surely find 4 errors in the log, plus a correct and valid insert.

Keep in mind, sequences are not meant to "look nice". Yes, if all control codes pass, then of course, each record is n 1 and it is "properly" (from an aesthetic point-of-view) inserted. But sequences are not about that. An auto_increment simply makes sure that you will not break a PK or UK constraint. That is all. So for any usage, regardless of a COMMIT or ROLLBACK, a sequence is "used". This is by design and guarantees data integrity.

As for simulating this, you can try a custom API call (or local insert, but if you want to be as close to the prod version as possible, you can use an API call) that always rolls back after inserting. Then checking the current sequence value of the auto_increment.