State description
I have two databases, DB1
and DB2
, that have the same table, Author
, with the fields Author.ID
and Author.AuthorName
.
The DB1.Author
has the AUTO_INCREMENT on its Author.ID
field, while the DB2.Author
does not have the AUTO_INCREMENT since it relies on the correctness of DB1
data.
Both tables have the PRIMARY index on Author.ID
and a UNIQUE index on Author.AuthorName
.
DB2.Author
has rows copied from the DB1.Author
.
Both databases use MariaDB version 10.6.7.
The problem
DB1
manager deleted some entries in the DB1.Author
table, and then reordered indexes to have no gaps in index numbers. This means they might have had:
ID | AuthorName |
---|---|
1 | A |
2 | B |
3 | C |
Then they deleted the row where the AuthorName
was 'B'
:
ID | AuthorName |
---|---|
1 | A |
3 | C |
And they finally updated the indexes to have no gaps (3-C changed to 2-C):
ID | AuthorName |
---|---|
1 | A |
2 | C |
Now I need to find a way to copy such updated state of the rows from the DB1.Author
to the DB2.Author
without deleting everything from the DB2.Author
table, so that I don't lose the data on CASCADE
effects.
What is the best approach for this?
My shot
This is what I did, but it obviously cannot work, since in the case of duplicate key, it would attempt to create another duplicate key (duplicate ID
2 would try to INSERT duplicate value of 'C'
, since it already exists on ID
3):
INSERT INTO DB2.Author (ID, AuthorName)
SELECT DB1.Author.ID, DB1.Author.AuthorName FROM DB1.Author
ON DUPLICATE KEY UPDATE
ID = DB1.Author.ID,
AuthorName = DB1.Author.AuthorName;
Additional ways?
Other than the possible SQL query solution, are there any other ways to automatically update the table data in one database when the other database changes its data? Would need to replicate only some tables, while other, linked tables are different.
CodePudding user response:
You need to treat the DB2 table as completely wrong as the update of primary keys on the source table would have completely spoilt it.
- Delete everything in DB2 table
- Insert into DB2 table everything from DB1 table
Going forwards, without being condescending, the users with access to DB1 need training (or perhaps you need to reconsider the security against the DB). Updating a primary key value is a wrong thing to do. Gapless sequences is a silly thing to want, especially when you have known dependencies. In fact, gapless sequences is often listed as poor database security (as it makes it easy to just cycle through all data).
You probably want to consider commercial solutions for logical data replication. If they don’t support updates of primary keys then you can use that as a good enough reason not to.
I would invest time in making sure there’s no other logical corruptions of data like this.
CodePudding user response:
tl;dr your problem is your DB manager. The solution is to get him/her to undo the damage they caused by restoring the data to how it was before. Deleting rows is fine. Updating primary keys is never OK.
Do not create a work around or validate it by accommodating his/her mistake, because doing so will make it more likely that it will happen again.
Full answer.
Your actual problem is your "DB manager", who violated a fundamental rule of databases: Never update surrogate key values!
In your case it's even more tragic, because gaps in the ID column values don't matter in any way. If gaps do matter, you're in even worse shape. Allow me to explain...
The author's name is your actual identifier. We know this because there a unique constraint on it.
The ID column is a surrogate key, which are most conveniently implemented as an auto incrementing integer, but surrogate keys would work just as well if they were random (unique) numbers. Gaps, and even the choice of values themselves, are irrelevant to the effectiveness of surrogate keys.