Which is more optimized: creating a new table each year or inserting into an existing table?

Time:04-12

I would like to create a database to store a lot of information (about 100 MB every year, I'd say). I know that every year I will add a lot of rows to the database.

So I am asking whether the most optimized approach is to insert all those rows into the same table and add a year column at the end to "archive" the data (because I need to keep the year). An example of what it would look like:

info 1  | info 2   | info 3   | info 4   | 2021
info 5  | info 6   | info 7   | info 8   | 2021
info 9  | info 10  | info 11  | info 12  | 2021
.
.
info 1000 | info 1001 | info 1002 | info 1003 | 2022
info 1004 | info 1005 | info 1005 | info 1006 | 2022

Or is it better to create a new table each year containing only that year's rows (and so drop the year column)? It would be something like this: (image: an example with a new table for each year)

So, in your opinion, which is the most optimized?

CodePudding user response:

Are you sure it is 100 MB every year, and not a typo for 100 GB (which is still relatively small, but OK) or 1 TB? For 100 MB each year you can go with whichever solution you prefer; the two options have no measurable difference whatsoever.

CodePudding user response:

When dealing with optimization, your design choices about how to store data should always be driven by the kind of operations you need to carry out.

Your operations should be divided into:

  • interactive, carried out in real-time (e.g. creation of an order by a user)
  • batch, carried out once in a while (e.g. creation of weekly/monthly/yearly reports about orders)

Your decisions about optimization should give higher weight to interactive operations as long as they're the most frequent ones.

Now let's get to your example:

  1. If you have no interactive operations that use data across different years, then the second design (one table per year) is better, since you will always deal with only a small piece of the whole data, and in the worst case you will combine the different tables once in a while (in SQL, stacking rows from tables with the same columns is usually done with UNION ALL rather than JOIN).
  2. If you have some interactive operations that use data across different years, the first design (a single table with a year column) is better, because you avoid combining tables in an interactive (real-time) operation.
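As a rough illustration of the two designs compared above, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table names (`records`, `records_2021`, ...), the column names `a` to `d`, the index name, and the helper functions are all hypothetical; the original question does not name them, and a different database engine may behave differently.

```python
import sqlite3

# Sample rows modeled on the question: four info columns plus the year.
SAMPLE_ROWS = [
    ("info 1", "info 2", "info 3", "info 4", 2021),
    ("info 5", "info 6", "info 7", "info 8", 2021),
    ("info 1000", "info 1001", "info 1002", "info 1003", 2022),
]


def build_single_table(rows):
    """Design 1: one table, with the year stored in an indexed column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (a TEXT, b TEXT, c TEXT, d TEXT, year INTEGER NOT NULL)"
    )
    # The index lets per-year queries avoid scanning rows from other years.
    conn.execute("CREATE INDEX idx_records_year ON records (year)")
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def build_table_per_year(rows):
    """Design 2: one table per year, without a year column."""
    conn = sqlite3.connect(":memory:")
    for a, b, c, d, year in rows:
        # Table names cannot be bound as parameters; int() keeps the name safe.
        table = f"records_{int(year)}"
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} (a TEXT, b TEXT, c TEXT, d TEXT)"
        )
        conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", (a, b, c, d))
    conn.commit()
    return conn


def count_for_year_single(conn, year):
    """Per-year query in design 1: filter on the year column."""
    row = conn.execute("SELECT COUNT(*) FROM records WHERE year = ?", (year,)).fetchone()
    return row[0]


def count_for_year_split(conn, year):
    """Per-year query in design 2: read only that year's table."""
    row = conn.execute(f"SELECT COUNT(*) FROM records_{int(year)}").fetchone()
    return row[0]
```

Both designs answer a single-year query easily; the difference is that design 2 needs the table name to be built in application code, while design 1 takes the year as an ordinary query parameter.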

Side note: JOIN operations are among the most expensive operations in a database, which is why you want to avoid them when you can.
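To make the cross-year case concrete, the sketch below runs the same "rows per year" report against both designs, again with Python's `sqlite3` module. The schema is a simplified, hypothetical one (a single `info` column), and the per-year tables are combined with `UNION ALL`, which is an assumption about how the combination would be written; the answer above only speaks of joining the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Design 1: a single table with a year column (hypothetical schema).
    CREATE TABLE records (info TEXT, year INTEGER NOT NULL);
    INSERT INTO records VALUES ('info 1', 2021), ('info 5', 2021), ('info 1000', 2022);

    -- Design 2: one table per year, no year column.
    CREATE TABLE records_2021 (info TEXT);
    CREATE TABLE records_2022 (info TEXT);
    INSERT INTO records_2021 VALUES ('info 1'), ('info 5');
    INSERT INTO records_2022 VALUES ('info 1000');
    """
)


def counts_single(conn):
    """Cross-year report in design 1: one plain GROUP BY."""
    sql = "SELECT year, COUNT(*) FROM records GROUP BY year ORDER BY year"
    return conn.execute(sql).fetchall()


def counts_split(conn, years):
    """Cross-year report in design 2: stitch the yearly tables together first."""
    # The year has to be re-attached as a literal because the tables do not store it.
    parts = [f"SELECT {int(y)} AS year, info FROM records_{int(y)}" for y in years]
    sql = (
        "SELECT year, COUNT(*) FROM ("
        + " UNION ALL ".join(parts)
        + ") GROUP BY year ORDER BY year"
    )
    return conn.execute(sql).fetchall()


print(counts_single(conn))
print(counts_split(conn, [2021, 2022]))
```

With the single table the report stays the same as years are added, whereas with per-year tables the query text grows by one branch per year and the application has to know which tables exist.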
