I have problem with slow queries.
PS
MariaDB: mariadb:10.3.25 - InnoDB
I optimized most DB configurations
Structure
create table customers
(
id bigint unsigned auto_increment
primary key,
email varchar(255) null,
full_name varchar(255) null,
country varchar(2) null,
first_name varchar(255) null,
second_name varchar(255) null,
company_name varchar(255) null,
gender char null,
birth_date date null,
state varchar(3) null,
null,
custom_field_1 varchar(255) null,
custom_field_2 varchar(255) null,
custom_field_3 varchar(255) null,
created_at timestamp null,
updated_at timestamp null,
deleted_at timestamp null
)
collate = utf8mb4_unicode_ci;
create table customer_daily_stats
(
date date not null,
campaign_id bigint not null,
customer_id bigint not null,
event_1 int unsigned default 0 not null,
event_2 int unsigned default 0 not null,
event_3 int unsigned default 0 not null,
event_4 int unsigned default 0 not null,
event_5 int unsigned default 0 not null,
constraint customer_daily_stats_date_customer_id_campaign_id_unique
unique (date, customer_id, campaign_id)
)
collate = utf8mb4_unicode_ci;
create index customer_daily_stats_customer_id_date_index
on customer_daily_stats (customer_id, date);
create index customer_daily_stats_campaign_id_index
on customer_daily_stats (campaign_id);
customers ~ 1 - 5 millions rows
customer_daily_stats ~ 1 - 100 millions rows
Queries
select
customers.*,
IFNULL(
SUM(events_aggregation.event_1),
0
) as event_1,
IFNULL(
SUM(events_aggregation.event_2),
0
) as event_2,
IFNULL(
SUM(events_aggregation.event_3),
0
) as event_3,
IFNULL(
SUM(events_aggregation.event_4),
0
) as event_4
from
`customers`
left join customer_daily_stats as events_aggregation on `customers`.`id` = `events_aggregation`.`customer_id`
and `events_aggregation`.`date` between '2021-09-06' and '2022-07-06'
group by
`customers`.`id`;
Problems
Main idea is to have possibility to get aggregation by any dates.
Problem is that works too slow now and i need to do addition aggregations which decrease performance. One more problem i don't have a lot of disc space (250G and about 80% used already).
I have:
- customers ~ 1.5m
- customer_daily_stats ~ 50.000
- query speed ~ 5s
Questions
- Is there any methods to optimize my DB or another tools?
- Is there any DBs that help my to increase performance?
CodePudding user response:
Change the indexes. You currently have
unique (date, customer_id, campaign_id)
INDEX(customer_id, date)
INDEX(campaign_id)
Maybe Change to:
PRIMARY KEY(customer_id, date, campaign_id)
INDEX(campaign_id)
BUT... And this is a big BUT. This rearrangement of indexing may significantly hurt other queries. We really need to see
- All the big queries
EXPLAIN SELECT
for each
Did you notice that the range is 10 months plus 1 day? This is because BETWEEN
is 'inclusive'.
If 80% of disk is already used, you are in deep weeds. Any fixes will require more than 20% of the disk to achieve.
One thing to do (when you have enough disk space) is to shrink BIGINT
(8 bytes, probably an excessive range) and INT UNSIGNED
(4 bytes, 4 billion max) to smaller int types where practical.
I'm confused. These seem to contradict each other; please clarify:
customer_daily_stats ~ 1 - 100 millions rows
customer_daily_stats ~ 50.000
Some more things to help with the analysis:
innodb_buffer_pool_size
RAM size
disk footprint for tables (GB)