Home > front end >  What I have done wrong with SELECT...JOIN?
What I have done wrong with SELECT...JOIN?

Time:09-06

I have having to tables in my database. Table_A contains statistics of a trajectory segments (segment_id, start/end coordinates, length).

CREATE TABLE table_A(segment_id INT NOT NULL, session_id INT NOT NULL,
segment_start_sec INT NOT NULL, lat_start DOUBLE PRECISION, lon_start DOUBLE PRECISION, 
lat_end DOUBLE PRECISION, lon_end DOUBLE PRECISION, length DOUBLE PRECISION)

Table_B contains row GPSlogs for all segments at 1 second sampling rate.

CREATE TABLE table_B(session_id INT NOT NULL,seconds INT NOT NULL, 
   lat DOUBLE PRECISION, lon DOUBLE PRECISION)

Now I want select the GPS logs of each segment, from its start to finish. So I do the query:

 SELECT a.segment_id,
    b.seconds,
    b.lat,
    b.lon
 FROM table_A a
     LEFT JOIN table_B b 
        ON a.session_id = b.session_id

The Problem:

This returns everything from start to finish of the longest segment. So:

-- check segment 0 total records
SELECT COUNT(*)
 FROM table_A a
     LEFT JOIN table_B b 
        ON a.session_id = b.session_id
WHERE a.segment_id=0

289

And

-- check segment 12 total records
SELECT COUNT(*)
 FROM table_A a
     LEFT JOIN table_B b 
        ON a.session_id = b.session_id
WHERE a.segment_id=12

289

-- same as segment_id=16

Apparently this is the GPS logs of the longest segment, but all the segments ended having same number of rows.

How do I filter the GPS logs of each segment, depending on its starting and finishing time in this case?

I reproduce these tables (complete) in this dbfiddle including all the queries stated above.

CodePudding user response:

So you want to get all the GPS for the trajectory segments one by one.

Since table_A does not store the seconds for the end of a segment, you can use the LEAD function to get it from the segment_start_sec of the next segment. But now you have a problem with the last segment, as it has no another segment to get the end seconds value from, so its end seconds value will be equal to the maximum value of seconds in table_B for the same session_id.

I should note, that in this solution the segments have "common" rows for each pair, this is the end of the current one and the beginning of the next one.

Finally query can look like this

SELECT a.segment_id,
    b.seconds,
    b.lat,
    b.lon
FROM (
    SELECT 
        *,
        LEAD(segment_start_sec) OVER (PARTITION BY session_id ORDER BY segment_id) segment_end_sec
    FROM table_A         
) a
LEFT JOIN table_B b 
    ON a.session_id = b.session_id
       AND b.seconds BETWEEN a.segment_start_sec AND COALESCE(a.segment_end_sec, 
                                                              (SELECT MAX(seconds) 
                                                               FROM table_B b1 
                                                               WHERE b1.session_id = b.session_id))

It returns 291 row overall for your sample data.

db<>fiddle here

  • Related