Home > Back-end >  How to improve execution time of a Laravel Query Builder generated SQL query
How to improve execution time of a Laravel Query Builder generated SQL query

Time:12-17

I have three tables that are concerned by this query

CREATE TABLE `tags` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `latName` varchar(191) NOT NULL,
  `araName` varchar(191) NOT NULL,
  `active` tinyint(1) NOT NULL DEFAULT 0,
  `img_name` varchar(191) DEFAULT NULL,
  `icon` varchar(191) DEFAULT NULL,
  `rgba_color` varchar(191) DEFAULT NULL,
  `color` varchar(191) DEFAULT NULL,
  `overlay` varchar(191) DEFAULT NULL,
  `position` int(11) NOT NULL,
  `mdi_icon` varchar(191) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `tags_latname_unique` (`latName`),
  UNIQUE KEY `tags_araname_unique` (`araName`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb3
 CREATE TABLE `newspapers` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `latName` varchar(191) NOT NULL,
  `araName` varchar(191) NOT NULL,
  `img_name` varchar(191) DEFAULT NULL,
  `active` tinyint(1) NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`),
  UNIQUE KEY `newspapers_latname_unique` (`latName`),
  UNIQUE KEY `newspapers_araname_unique` (`araName`)
) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=utf8mb3
CREATE TABLE `articles` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `newspaper_id` bigint(20) unsigned NOT NULL,
  `tag_id` bigint(20) unsigned NOT NULL,
  `seen` int(10) unsigned NOT NULL,
  `link` varchar(1000) NOT NULL,
  `title` varchar(191) NOT NULL,
  `img_name` varchar(191) NOT NULL,
  `date` datetime NOT NULL,
  `paragraph` text NOT NULL,
  `read_time` int(11) DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `articles_link_unique` (`link`),
  UNIQUE KEY `articles_img_name_unique` (`img_name`),
  KEY `articles_newspaper_id_foreign` (`newspaper_id`),
  KEY `articles_tag_id_foreign` (`tag_id`),
  CONSTRAINT `articles_newspaper_id_foreign` FOREIGN KEY (`newspaper_id`) REFERENCES `newspapers` (`id`),
  CONSTRAINT `articles_tag_id_foreign` FOREIGN KEY (`tag_id`) REFERENCES `tags` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=47421 DEFAULT CHARSET=utf8mb3

Basically, I want to load the latest 5 articles (ordered by date) that have an active newspaper and active tag.

Right now articles table contains about 40k entries.

This is the query generated by Laravel's query builder

SELECT `articles`.*
FROM `articles`
INNER JOIN `tags` ON `tags`.`id` = `articles`.`tag_id`
  AND `tags`.`active` = 1
INNER JOIN `newspapers` ON `newspapers`.`id` = `articles`.`newspaper_id`
  AND `newspapers`.`active` = 1
ORDER BY `date` DESC
LIMIT 5;

It takes Mysql about 6sec to run the query, when I remove the ORDER BY clause, the query becomes very fast (0.001sec).

Here is the query explanation:

 ------ ------------- ------------ -------- ------------------------------------------------------- ------------------------------- --------- ------------------------ ------ ---------------------------------------------- 
| id   | select_type | table      | type   | possible_keys                                         | key                           | key_len | ref                    | rows | Extra                                        |
 ------ ------------- ------------ -------- ------------------------------------------------------- ------------------------------- --------- ------------------------ ------ ---------------------------------------------- 
|    1 | SIMPLE      | newspapers | ALL    | PRIMARY                                               | NULL                          | NULL    | NULL                   | 18   | Using where; Using temporary; Using filesort |
|    1 | SIMPLE      | articles   | ref    | articles_newspaper_id_foreign,articles_tag_id_foreign | articles_newspaper_id_foreign | 8       | mouhim.newspapers.id   | 1127 |                                              |
|    1 | SIMPLE      | tags       | eq_ref | PRIMARY                                               | PRIMARY                       | 8       | mouhim.articles.tag_id | 1    | Using where                                  |
 ------ ------------- ------------ -------- ------------------------------------------------------- ------------------------------- --------- ------------------------ ------ ---------------------------------------------- 

I tried creating an index on the date attribute but it didn't help.

for convenience, this is how I am using Query Builder for this query:

Article::select("articles.*")
    ->join("tags", function ($join) {
        $join->on("tags.id", "articles.tag_id")
            ->where("tags.active", 1);
    })
    ->join("newspapers", function ($join) {
        $join->on("newspapers.id", "articles.newspaper_id")
            ->where("newspapers.active", 1);
    })
        ->orderBy("date", "desc")
        ->paginate(5)

At first, I was using Eloquent (whereHas) but Eloquent was generating non optimized query using (where exists), so I had to go the joins way.

What can I do to improve execution time of this query?

Result of SHOW INDEXES FROM articles;

 ---------- ------------ ------------------------------- -------------- -------------- ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- 
| Table    | Non_unique | Key_name                      | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
 ---------- ------------ ------------------------------- -------------- -------------- ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- 
| articles |          0 | PRIMARY                       |            1 | id           | A         |       36072 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| articles |          0 | articles_link_unique          |            1 | link         | A         |       36072 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| articles |          0 | articles_img_name_unique      |            1 | img_name     | A         |       36072 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| articles |          1 | articles_newspaper_id_foreign |            1 | newspaper_id | A         |          32 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| articles |          1 | articles_tag_id_foreign       |            1 | tag_id       | A         |          12 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| articles |          1 | data                          |            1 | date         | A         |       36072 |     NULL | NULL   |      | BTREE      |         |               | NO      |
 ---------- ------------ ------------------------------- -------------- -------------- ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- 

This query was suggested by Rick James as a solution

SELECT  `articles`.*
FROM  `articles`     
WHERE EXISTS ( SELECT 1 FROM tags WHERE id = `articles`.`tag_id` and active = 1)
AND EXISTS ( SELECT 1 FROM newspapers WHERE id = `articles`.`newspaper_id` and active = 1)     
ORDER BY `date` DESC
LIMIT  5;

Running EXPLAIN on this query yields the following result

 ------ ------------- ------------ -------- ------------------------------------------------------- ------------------------------- --------- ------------------------ ------ ---------------------------------------------- 
| id   | select_type | table      | type   | possible_keys                                         | key                           | key_len | ref                    | rows | Extra                                        |
 ------ ------------- ------------ -------- ------------------------------------------------------- ------------------------------- --------- ------------------------ ------ ---------------------------------------------- 
|    1 | PRIMARY     | newspapers | ALL    | PRIMARY                                               | NULL                          | NULL    | NULL                   | 18   | Using where; Using temporary; Using filesort |
|    1 | PRIMARY     | articles   | ref    | articles_newspaper_id_foreign,articles_tag_id_foreign | articles_newspaper_id_foreign | 8       | mouhim.newspapers.id   | 1127 |                                              |
|    1 | PRIMARY     | tags       | eq_ref | PRIMARY                                               | PRIMARY                       | 8       | mouhim.articles.tag_id | 1    | Using where                                  |
 ------ ------------- ------------ -------- ------------------------------------------------------- ------------------------------- --------- ------------------------ ------ ---------------------------------------------- 

CodePudding user response:

Assuming you don't want dups, change to this; it is likely to be much faster:

SELECT  `articles`.*
    FROM  `articles`
    WHERE EXISTS ( SELECT 1 FROM tags
                        WHERE id = `articles`.`tag_id` )
      AND EXISTS ( SELECT 1 FROM newspapers
                        WHERE id = `articles`.`newspaper_id` )
    ORDER BY  `date` DESC
    LIMIT  5;

Also, have this index on articles:

INDEX(date)

(This is a rare use case for starting index with a column that will be used in a 'range'.)

(Sorry, I don't speak 'Laravel'; maybe someone else can help with that part.)

PS. Having 3 UNIQUE keys on a table is highly unusual. It often indicates a problem with the schema design.

each article has one and only one Tag associated with it

Can multiple articles have the same Tag?

when I remove the ORDER BY clause, the query becomes very fast (0.001sec).

That is because you get whatever 5 rows are easy to return to you. Clearly the ORDER BY is part of the requirement. "Using temporary; Using filesort" says there was at least a sort. It will actually be a "file" sort -- because SELECT * includes a TEXT column. (There is a technique to avoid "file", but I don't think it is needed here.)

CodePudding user response:

I am not sure if the two queries are supposed to be same, but they are not.

Anyway for the second query I think this should be better

Article::leftJoin('tags', 'articles.tag_id', '=', 'tags.id)
   ->where('tags.latName', $tag)
   ->orderBy("articles.date", "desc")
   ->select(['articles.*'])
   ->paginate(5);

The problem is probably, that the subquery you created in whereIn is slowing it down and whereIn itself may as well slow your query. This may be eased by using join and where.

As for the first query, can you show how you did the index for date? :)

  • Related