Home > Back-end >  Postgres Query Optimization without adding an extra index
Postgres Query Optimization without adding an extra index

Time:03-25

I was trying to optimize this query differently, but before that, Can we make any slight change in this query to reduce the time without adding any index?

Postgres version: 13.5

Query:

SELECT 
  orders.id as order_id,
  orders.*, u1.name as user_name,
  u2.name as driver_name,
  u3.name as payment_by_name, referrals.name as ref_name, 
  array_to_string(array_agg(orders_payments.payment_type_name), ',') as payment_type_name,
  array_to_string(array_agg(orders_payments.amount), ',') as payment_type_amount,
  array_to_string(array_agg(orders_payments.reference_code), ',') as reference_code,
  array_to_string(array_agg(orders_payments.tips), ',') as tips, 
  array_to_string(array_agg(locations.name), ',') as location_name,
  (select 
    SUM(order_items.tax) as tax from order_items 
    where order_items.order_id = orders.id and order_items.deleted = 'f'
  ) as tax,
  (select 
    SUM(orders_surcharges.surcharge_tax) as surcharge_tax from orders_surcharges
    where orders_surcharges.order_id = orders.id
  ) 
FROM "orders" 
LEFT JOIN 
  users as u1 ON u1.id = orders.user_id 
LEFT JOIN 
  users as u2 ON u2.id = orders.driver_id
LEFT JOIN 
  users as u3 ON u3.id = orders.payment_received_by 
LEFT JOIN
  referrals ON referrals.id = orders.referral_id
INNER JOIN 
  locations ON locations.id = orders.location_id 
LEFT JOIN
 orders_payments ON orders_payments.order_id = orders.id 
WHERE
  (orders.company_id = '626') 
AND 
  (orders.created_at BETWEEN '2021-04-23 20:00:00' AND '2021-07-24 20:00:00') 
AND 
  orders.order_status_id NOT IN (10, 5, 50)
GROUP BY
  orders.id, u1.name, u2.name, u3.name, referrals.name 
ORDER BY 
  created_at ASC LIMIT 300 OFFSET 0

Current Index:

"orders_pkey" PRIMARY KEY, btree (id)
"idx_orders_company_and_location" btree (company_id, location_id)
"idx_orders_created_at" btree (created_at)
"idx_orders_customer_id" btree (customer_id)
"idx_orders_location_id" btree (location_id)
"idx_orders_order_status_id" btree (order_status_id)

Execution Plan

Seems this takes more time on the parallel heap scan.

CodePudding user response:

You're looking for 300 orders and try to get some additional information about these records. I would see if I could first get these 300 records, instead of getting all the data and then limit it to 300. Something like this:

WITH orders_300 AS (
    SELECT  * -- just get the columns that you really need, never use * in production
    FROM    orders
        INNER JOIN locations ON locations.id = orders.location_id 
    WHERE   orders.company_id = '626'
    AND     orders.created_at BETWEEN '2021-04-23 20:00:00' AND '2021-07-24 20:00:00'
    AND     orders.order_status_id NOT IN (10, 5, 50)
    ORDER BY 
      created_at ASC LIMIT 300 -- LIMIT 
    OFFSET 0
)
SELECT 
  orders.id as order_id,
  orders.*, -- just get the columns that you really need, never use * in production
  u1.name as user_name,
  u2.name as driver_name,
  u3.name as payment_by_name, referrals.name as ref_name, 
  array_to_string(array_agg(orders_payments.payment_type_name), ',') as payment_type_name,
  array_to_string(array_agg(orders_payments.amount), ',') as payment_type_amount,
  array_to_string(array_agg(orders_payments.reference_code), ',') as reference_code,
  array_to_string(array_agg(orders_payments.tips), ',') as tips, 
  array_to_string(array_agg(locations.name), ',') as location_name,
  (SELECT   SUM(order_items.tax) as tax 
    FROM    order_items 
    WHERE   order_items.order_id = orders.id 
    AND     order_items.deleted = 'f'
  ) as tax,
  ( SELECT  SUM(orders_surcharges.surcharge_tax) as surcharge_tax 
    FROM    orders_surcharges
    WHERE   orders_surcharges.order_id = orders.id
  ) 
FROM "orders_300" AS orders
    LEFT JOIN users as u1 ON u1.id = orders.user_id 
    LEFT JOIN users as u2 ON u2.id = orders.driver_id
    LEFT JOIN users as u3 ON u3.id = orders.payment_received_by 
    LEFT JOIN referrals ON referrals.id = orders.referral_id
    LEFT JOIN orders_payments ON orders_payments.order_id = orders.id 
GROUP BY
  orders.id, u1.name, u2.name, u3.name, referrals.name 
ORDER BY 
      created_at;
  

This will at least have a huge impact on the slowest part of your query, all these index scans on orders_payments. Every single scan is fast, but the query is doing 165000 of them... Limit this to just 300 and will be much faster.

Another issue is that none of your indexes covers the entire WHERE condition on the table "orders". But if you can't create a new index, you're out of luck.

  • Related