optimize given sql query for speed-CodePudding

I got this query from a fellow programmer, but it executes extremely slowly. it takes about 4 seconds to run.

Can this query be optimized to give the same result, but perform better?

SELECT checkedcrates, 
   pv.name as powervision, 
   cg.name as cgrp, 
   cs.name as csz, 
   c.name as cname
   FROM  public.inspectionresultsstatistic e, 
        crates c, 
        powervisions pv, 
        lines l, 
        quality q, 
        cratesgroupscrates cgc, 
        cratesgroups cg, 
        cratessizes cs 
   where
        c.id = e.crateid 
        and l.id = e.lineid 
        and pv.id = l.powervisionid
        and q.id = e.qualityid 
        and c.id = cgc.crateid 
        and cs.id = cgc.cratesizeid 
        and cg.id = cgc.crategroupid 
        and qualityid = 0
        and pv.name in ('PV101')  
        and c.name in ('24603','104','136','154','186','106','156','216','246','206')
        and cg.name in ('Black','Blue','DLL','Green')
        and cs.name in ('30x40','60x40')
        and to_timestamp(e.startts) >= '2021-10-18T17:45:22Z' 
        and to_timestamp(e.stopts-1) <= '2021-10-18T19:45:22Z'
   group by 
     powervision, 
     cgrp, 
     csz, 
     cname, 
     checkedcrates, 
     startts

EDIT: actually just noticed that it is the inner select being slow... updated the query above by removing the outer query

EDIT2: maybe I should add some indexes? I have a index for every table where it is connected to the other (so all the ID columns) and have for the inspectionresultsstatistic a index on id qualityid startts stopts

EDIT3: as per request I try to give more informations about my tables and the data.

I am using PostgreSql 12,

the table structures are as follows:

http://sqlfiddle.com/#!17/bb5a6/1

all tables are rather small with less than 50 entries, except for inspectionresultsstatistics that contains about 12.000.000 rows.

CodePudding user response：

My big tell with this is that your joins are not written with more updated syntax. When you use commas between tables in your FROM clause, it creates a 'Cartesian Product' of both tables. This is much more resource intensive than using 'INNER JOIN' followed by an 'ON' clause. The ON clause should be used instead of a constraint in the WHERE clause.

Somthing like,

   SELECT checkedcrates, 
    pv.name as powervision, 
    cg.name as cgrp, 
    cs.name as csz, 
    c.name as cname
FROM  public.inspectionresultsstatistic e
    INNER JOIN crates c
        ON c.id = e.crateid 
    INNER JOIN lines l
        ON l.id = e.lineid 
    INNER JOIN powervisions pv
        ON pv.id = l.powervisionid
    INNER JOIN quality q
        ON q.id = e.qualityid
    INNER JOIN cratesgroupscrates cgc
        ON c.id = cgc.crateid
    INNER JOIN cratesgroups cg
        ON cg.id = cgc.crategroupid
    INNER JOIN cratessizes cs 
        ON cs.id = cgc.cratesizeid
where qualityid = 0
    and pv.name in ('PV101')  
    and c.name in ('24603','104','136','154','186','106','156','216','246','206')
    and cg.name in ('Black','Blue','DLL','Green')
    and cs.name in ('30x40','60x40')
    and to_timestamp(e.startts) >= '2021-10-18T17:45:22Z' 
    and to_timestamp(e.stopts-1) <= '2021-10-18T19:45:22Z'
group by 
    powervision, 
    cgrp, 
    csz, 
    cname, 
    checkedcrates, 
    startts

CodePudding user response：

Your fiddle doesn't include the indexes. You should use db-fiddle.com rather than sqlfiddle as the latter is essentially broken, having nothing newer than 9.6 to offer.

The time range being queried is pretty small, so presumably is very selective.

You would be better querying that as a range:

and tstzrange(to_timestamp(e.startts),to_timestamp(e.stopts-1),'[]') <@ 
    tstzrange('2021-10-18T17:45:22Z','2021-10-18T19:45:22Z','[]')

Which would benefit from the expression index

on inspectionresultsstatistic using gist (tstzrange(to_timestamp(startts),to_timestamp(stopts-1),'[]'))

But really, probably better off to store your timestamps as timestamptz rather than int8, or maybe even store them as tstzrange directly.