I have a MySQL database hosted by a web hosting company (Hostinger). The database is used by a mobile app through PHP APIs. There are many tables.
I will show the important tables, with only their important columns, to make this easier to understand:
user(id, username, password, balance, state);
cardsTrans(id, user_id, number, password, price, state);
customersTrans(id, user_id, location, state);
posTrans(id, user_id, number, state);
I thought about creating one table instead of these three transaction tables, like this:
allTransaction(id, user_id, target_id, type, card_number, card_pass, location);
I know that this introduces redundancy and that some columns will be null. I could normalize this table, but normalization would mean many joins when querying the data, and I care about response time. To explain the main idea: the user can do three types of transactions (each type currently has its own table). These transactions would be stored in the allTransaction table, with user_id as a foreign key to the users table and target_id as a foreign key to another table that depends on the type. The other columns also depend on the type and may be null.
What I want is to determine which design is better for response time and performance when users are using the app. DML operations (insert, update, delete) are applied frequently to these tables, and there are also a lot of queries, usually by user_id and target_id.
If I use one table, it will have a very large number of rows and many null values in each row, which I expect will slow down queries and take a lot of storage.
If the table has an index, the index will slow down insert and update operations.
Would creating a partition per user on the table, without indexes, be better for response time for any operation (select, insert, update, or delete), or is creating multiple tables (a table per user) better? The expected number of users is between 500 and 5000.
I searched and found this similar question: "MySQL performance: multiple tables vs. index on single table and partitions". But it is not in the same context: I am interested in response time first and then performance, and my database is on a hosting server, not on the same device as the mobile app.
Can anyone tell me which is better and why?
CodePudding user response:
As a general rule:
- Worst: Multiple tables
- Better: Built-in PARTITIONing
- Best: Neither, just better indexing.
If you want to talk specifically about your case, please provide SHOW CREATE TABLE and the main SELECTs, DELETEs, etc.
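For reference, the statement is run once per table and prints the full definition, including indexes and the storage engine. The table names below are the ones listed in the question; substitute the real ones if they differ.

```sql
-- Table names taken from the question; adjust to the actual schema.
SHOW CREATE TABLE cardsTrans;
SHOW CREATE TABLE customersTrans;
SHOW CREATE TABLE posTrans;
```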
It is possible to "over-normalize".
"three types of transactions (each type is with different table)"
That can be tricky. It may be better to have one table for transactions.
"Response time" -- Are you expecting hundreds of writes per second?
"take large storage."
Usually proper indexing (especially with 'composite' indexes) makes table size not a performance issue.
"partition per user on the table"
That is no faster than having an index starting with user_id.
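As a rough illustration of "an index starting with user_id", here is a minimal sketch of the single-table layout with one composite index. The columns follow the allTransaction outline in the question; every column type, the index name idx_user_target, and the literal values in the queries are assumptions made for the example.

```sql
-- Hypothetical definition based on the allTransaction outline in the question.
-- All column types are assumptions.
CREATE TABLE allTransaction (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id     INT NOT NULL,
    target_id   INT NOT NULL,
    type        ENUM('card', 'customer', 'pos') NOT NULL,
    card_number VARCHAR(32) NULL,
    card_pass   VARCHAR(64) NULL,
    location    VARCHAR(255) NULL,
    PRIMARY KEY (id),
    -- Composite index: usable for lookups by user_id alone
    -- and for lookups by user_id plus target_id.
    INDEX idx_user_target (user_id, target_id)
) ENGINE=InnoDB;

-- Both queries can use idx_user_target.
SELECT * FROM allTransaction WHERE user_id = 42;
SELECT * FROM allTransaction WHERE user_id = 42 AND target_id = 7;

-- EXPLAIN reports which index the optimizer actually chooses.
EXPLAIN SELECT * FROM allTransaction WHERE user_id = 42 AND target_id = 7;
```

A query that filters on target_id alone would not benefit from this index, because user_id is its leading column; that case would need its own index.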
"If the table has index, the index will slowing the insert or update operations."
The burden on writes is much less than the benefit on reads. Do not avoid indexes for that reason.
(I can be less vague if you provide tentative CREATE TABLEs and SQL statements.)
CodePudding user response:
Instead of trying to predict the future, use the simplest schema that will work for now and be prepared to change it when you learn more from actual use. This means not scattering assumptions about the schema around the code. Look into the concept of Schema Migrations to safely change your schema and the Repository Pattern to hide the details of how things are stored. 5000 users is not a lot (unless they will all be using the system at the same time).
For now, go with the design that provides the strongest referential integrity. That means as many not null columns as possible. While you're developing the product, you're going to be introducing bugs which might accidentally insert nulls where a value should be inserted. Referential integrity provides another layer of protection.
For example, if you have a single AllTransactions table in which some fields may or may not be filled in depending on the type of transaction, your schema has to make all these columns nullable. The schema cannot protect you from accidentally inserting a null value.
But if you have individual CardTransactions, CustomerTransactions, and PosTransactions tables, their schemas can be constrained to ensure all the necessary fields are always filled in. This will catch many different sorts of bugs.
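To make that concrete, here is a minimal sketch of a constrained per-type table and an insert that the schema refuses. The columns follow cardsTrans from the question; the column types and the sample values are assumptions.

```sql
-- Hypothetical definition; columns follow cardsTrans in the question,
-- column types are assumptions.
CREATE TABLE CardTransactions (
    id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id  INT NOT NULL,
    number   VARCHAR(32) NOT NULL,
    password VARCHAR(64) NOT NULL,
    price    DECIMAL(10, 2) NOT NULL,
    state    VARCHAR(16) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Rejected by MySQL because number is declared NOT NULL.
INSERT INTO CardTransactions (user_id, number, password, price, state)
VALUES (1, NULL, 'secret', 9.99, 'pending');
```

In a single nullable table the same insert would succeed, and the missing card number would only be noticed later.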
A variation on this is to have a single UserTransaction table which stores all the generic information about a user transaction (user_id, timestamp) and then join tables for each type of transaction. Here's a sketch.
user_transactions
  id bigint primary key auto_increment
  user_id integer not null references users on delete cascade
  -- Fields common to every transaction below
  state enum(...) not null
  price numeric not null
  created_at timestamp not null default current_timestamp()

card_transactions
  user_transaction_id bigint not null references user_transactions on delete cascade
  card_id integer not null references cards on delete cascade
  ..any other fields for card transactions...

pos_transactions
  user_transaction_id bigint not null references user_transactions on delete cascade
  pos_id integer not null references pos on delete cascade
  ..any other fields for POS transactions...
This provides full referential integrity. You can't make a card transaction without a card. You can't make a POS transaction without a POS. Any fields required by a card transaction can be set not null. Any fields required by a POS transaction can be set not null.
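One MySQL-specific caveat when turning the sketch into real DDL: according to the MySQL manual, a references clause written inline on a column is parsed but not enforced, so the foreign keys need to be declared as separate FOREIGN KEY clauses. Below is a minimal MySQL version of the first two tables under that assumption. The stub users and cards tables, the VARCHAR used for state (the sketch's enum values are not given), the DECIMAL precision, the index name, and the choice of user_transaction_id as the primary key of the detail table are all assumptions made for illustration.

```sql
-- Stub parent tables so the example is self-contained (hypothetical;
-- the real tables will have more columns).
CREATE TABLE users (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY) ENGINE=InnoDB;
CREATE TABLE cards (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY) ENGINE=InnoDB;

CREATE TABLE user_transactions (
    id         BIGINT NOT NULL AUTO_INCREMENT,
    user_id    INT NOT NULL,
    state      VARCHAR(16) NOT NULL,        -- stands in for the sketch's enum(...)
    price      DECIMAL(10, 2) NOT NULL,     -- precision is an assumption
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    -- Index used by "all transactions for a user" lookups.
    INDEX idx_user (user_id),
    FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE
) ENGINE=InnoDB;

CREATE TABLE card_transactions (
    user_transaction_id BIGINT NOT NULL,
    card_id             INT NOT NULL,
    -- Assumes at most one detail row per transaction.
    PRIMARY KEY (user_transaction_id),
    FOREIGN KEY (user_transaction_id) REFERENCES user_transactions (id) ON DELETE CASCADE,
    FOREIGN KEY (card_id) REFERENCES cards (id) ON DELETE CASCADE
) ENGINE=InnoDB;
```

pos_transactions would follow the same pattern as card_transactions, with pos_id referencing the pos table.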
Getting all transactions for a user is a simple indexed query.
select *
from user_transactions
where user_id = ?
And if you only want one type, do a join, which is also a simple indexed query.
select *
from card_transactions ct
join user_transactions ut on ut.id = ct.user_transaction_id
where ut.user_id = ?
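A left join becomes useful in the other direction: listing all of a user's transactions together with whatever type-specific columns exist. This is a sketch that assumes the table and column names from the outline above, and that created_at is the desired sort order.

```sql
SELECT ut.id, ut.state, ut.price, ut.created_at,
       ct.card_id,   -- NULL when the row is not a card transaction
       pt.pos_id     -- NULL when the row is not a POS transaction
FROM user_transactions ut
LEFT JOIN card_transactions ct ON ct.user_transaction_id = ut.id
LEFT JOIN pos_transactions  pt ON pt.user_transaction_id = ut.id
WHERE ut.user_id = ?
ORDER BY ut.created_at DESC;
```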