I have a query. It has an index but I'm not sure why it's taking time for fewer rows. Can anyone please suggest to improving that query performance?
SELECT
up.id post_id,
c.id community_id,
SUBSTRING(up.description, 1, 30) post_description,
c.name community_name,
c.is_index is_index,
c.slug slug,
p.last_modified_on last_modified_on,
c.is_shop is_shop
FROM
user_post up
INNER JOIN community c ON up.community_id = c.id
and c.is_index = 1
inner join participant pa on c.participant_id = pa.id
and pa.is_active = 1
inner join participating_entity p on up.participating_entity_id = p.id
left join indexed_post ip on ip.user_post_id = up.id
and ip.is_active = 1
WHERE
p.is_active = 1
and up.is_spam_post = 0
and ip.id is null
and LENGTH(up.description) >= 20
order by
up.id
limit
358223,
5000\G
row scans:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: c
partitions: NULL
type: ref
possible_keys: PRIMARY,fk_community_participant1_idx,idx_is_index_participant_id
key: idx_is_index_participant_id
key_len: 2
ref: const
rows: 6702
filtered: 100.00
Extra: Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: pa
partitions: NULL
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: she.c.participant_id
rows: 1
filtered: 10.00
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: up
partitions: NULL
type: ref
possible_keys: fk_user_post_community1_idx,fk_user_post_entity1_idx,is_spam_post,idx_is_spam_post,idx_community_id_participating_entity_id_is_spam_post,idx_participating_entity_id_community_id
key: idx_community_id_participating_entity_id_is_spam_post
key_len: 9
ref: she.c.id
rows: 338
filtered: 50.00
Extra: Using index condition; Using where
*************************** 4. row ***************************
id: 1
select_type: SIMPLE
table: ip
partitions: NULL
type: ref
possible_keys: idx_user_post_id
key: idx_user_post_id
key_len: 8
ref: she.up.id
rows: 1
filtered: 10.00
Extra: Using where; Not exists
*************************** 5. row ***************************
id: 1
select_type: SIMPLE
table: p
partitions: NULL
type: eq_ref
possible_keys: PRIMARY,idx_is_active
key: PRIMARY
key_len: 8
ref: she.up.participating_entity_id
rows: 1
filtered: 50.00
Extra: Using where
It takes more than 15 seconds. Can anyone please suggest what is the problem? Can't it be solved by rewriting the query to increase the performance?
data output:
*************************** 5000. row ***************************
post_id: 1788267
community_id: 492
post_description: #none #FoodParty
community_name: Cooking, Food, Recipes and More...
is_index: 1
slug: cooking-food-recipes-and-more
last_modified_on: 2020-03-09 17:48:50
is_shop: 0
5000 rows in set (2 min 13.05 sec)
Table structure;
Table : Community
PRIMARY KEY (`id`),
KEY `fk_community_participant1_idx` (`participant_id`),
KEY `fk_community_community_type1_idx` (`community_type_id`),
KEY `idx_parent_participant_id` (`parent_participant_id`),
KEY `idx_score_latest` (`score_latest`),
KEY `slug` (`slug`(191)),
KEY `idx_is_index_participant_id` (`is_index`,`participant_id`),
CONSTRAINT `_fk_community_community_type1` FOREIGN KEY (`community_type_id`) REFERENCES `community_type` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `_fk_community_participant1` FOREIGN KEY (`participant_id`) REFERENCES `participant` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
Table : participant
*************************** 1. row ***************************
Table: participant
Create Table: CREATE TABLE `participant` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`participant_type_id` bigint(20) NOT NULL,
`crdt` timestamp NULL DEFAULT NULL,
`created_by` bigint(20) DEFAULT NULL,
`last_modified_on` timestamp NULL DEFAULT NULL,
`last_modified_by` bigint(20) DEFAULT NULL,
`is_active` tinyint(1) DEFAULT '1',
`is_deleted` tinyint(1) DEFAULT '0',
`partner_id` bigint(20) DEFAULT NULL,
`sheroes_deep_link` varchar(500) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `fk_participant_participation_type1_idx` (`participant_type_id`),
KEY `idx_participant_last_modified_on` (`last_modified_on`),
KEY `idx_participant_crdt` (`crdt`),
KEY `sheroes_deep_link` (`sheroes_deep_link`(191)),
CONSTRAINT `fk_participant_participation_type1` FOREIGN KEY (`participant_type_id`) REFERENCES `participant_type` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
Table: participating_entity
PRIMARY KEY (`id`),
KEY `fk_entity_entity_type_idx` (`entity_type_id`),
KEY `idx_participating_entity_last_modified_on` (`last_modified_on`),
KEY `created_by` (`created_by`),
KEY `fk_partner_id_idx` (`partner_id`),
KEY `crdt` (`crdt`),
KEY `idx_category` (`category`),
KEY `idx_created_by` (`created_by`),
KEY `idx_is_active` (`is_active`),
CONSTRAINT `fk_entity_entity_type` FOREIGN KEY (`entity_type_id`) REFERENCES `entity_type` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `fk_partner_id` FOREIGN KEY (`partner_id`) REFERENCES `api_consumer_partner` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
Table : indexed_post
*************************** 1. row ***************************
Table: indexed_post
Create Table: CREATE TABLE `indexed_post` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`user_post_id` bigint(20) NOT NULL,
`is_active` tinyint(1) DEFAULT '0',
`crdt` timestamp NULL DEFAULT NULL,
`created_by` bigint(20) DEFAULT NULL,
`last_modified_by` bigint(20) DEFAULT NULL,
`last_modified_on` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_user_post_id` (`user_post_id`)
Table : user_post
PRIMARY KEY (`id`),
KEY `fk_user_post_users1_idx` (`users_id`),
KEY `fk_user_post_community1_idx` (`community_id`),
KEY `fk_user_post_entity1_idx` (`participating_entity_id`),
KEY `idx_source_ent_id` (`source_entity_id`),
KEY `idx_entity_start_dt` (`entity_start_date`),
KEY `idx_rating_for_company` (`rating`),
KEY `user_post_source_type` (`source_type`),
KEY `is_spam_post` (`is_spam_post`),
KEY `idx_is_spam_post` (`is_spam_post`),
KEY `idx_is_theme_post` (`is_theme_post`),
KEY `idx_meta_title` (`meta_title`),
KEY `idx_meta_description` (`meta_description`),
KEY `idx_recipient_id` (`recipient_id`),
KEY `idx_title` (`title`(191)),
KEY `idx_community_id_participating_entity_id_is_spam_post` (`community_id`,`participating_entity_id`,`is_spam_post`),
KEY `user_post_is_recommended_IDX` (`is_recommended`),
KEY `idx_participating_entity_id_community_id` (`participating_entity_id`,`community_id`),
CONSTRAINT `__fk_user_post_community1` FOREIGN KEY (`community_id`) REFERENCES `community` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `__fk_user_post_entity1` FOREIGN KEY (`participating_entity_id`) REFERENCES `participating_entity` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `__fk_user_post_users1` FOREIGN KEY (`users_id`) REFERENCES `users` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
CodePudding user response:
Nothing can fix
limit 358223, 5000
That query must join all those tables, filter out the unnecessary rows, sort them, skip over 358223 rows and finally deliver 5000.
If you are dumping a million rows, either
- Dump them all at once, so you are not rebuilding the set over and over.
- "Remember where you left off" so you don't have to perform
OFFSET
. However, because of all the complexity in the query, I doubt if this is possible.
So, in my opinion, the real cure is to rethink why you need this query.
Note that each table needs to test for is_active
. This effective prevents some optimizations.
CodePudding user response:
Could you try the following?
SELECT
up.id post_id,
c.id community_id,
SUBSTRING(up.description, 1, 30) post_description,
c.name community_name,
c.is_index is_index,
c.slug slug,
p.last_modified_on last_modified_on,
c.is_shop is_shop
FROM
user_post up
INNER JOIN community c ON up.community_id = c.id AND up.is_spam_post = 0 and c.is_index = 1 AND LENGTH(up.description) >= 20
inner join participant pa on c.participant_id = pa.id and pa.is_active = 1
inner join participating_entity p on up.participating_entity_id = p.id AND p.is_active = 1
left join indexed_post ip on ip.user_post_id = up.id and ip.is_active = 1
WHERE
ip.id is null
order by
up.id
limit
358223,
5000\G