Calculate the difference in trips between two consecutive months in a SQL query

Time:12-30

I have a simple table that contains trips over different dates.

trip_id  start_date               end_date
160320   2017-12-31 20:40:25 UTC  2017-12-31 20:45:25 UTC
160321   2018-01-12 21:01:51 UTC  2018-01-12 22:01:51 UTC

I simply want to create a SQL query that shows these fields.

  • year
  • month
  • trips_this_month
  • trips_previous_month
  • difference_from_previous_month (trips_this_month - trips_previous_month)
  • is_increased (a boolean column that is true if we saw an increase, false otherwise)

Update: I managed to write a simple query that produces them, but I still feel it can be optimized. Any help will be appreciated.

SELECT
    year,
    month,
    trips_this_month,
    trips_previous_month,
    difference_from_previous_month,
    -- true only for a strict increase over the previous month
    CASE WHEN difference_from_previous_month > 0 THEN true ELSE false END AS is_increased
FROM (
    SELECT
        year,
        month,
        number_of_trips AS trips_this_month,
        -- the first month has no predecessor, so LAG defaults to 0
        LAG(number_of_trips, 1, 0) OVER (ORDER BY year, month) AS trips_previous_month,
        number_of_trips - LAG(number_of_trips, 1, 0) OVER (ORDER BY year, month) AS difference_from_previous_month
    FROM (
        SELECT
            EXTRACT(YEAR FROM start_date) AS year,
            EXTRACT(MONTH FROM start_date) AS month,
            COUNT(*) AS number_of_trips
        FROM a_table
        GROUP BY year, month
    ) AS monthly_counts
) AS monthly_changes
ORDER BY year, month
LIMIT 100

I could not get any further on my own, and I would appreciate help completing it.

CodePudding user response:

Try this:

WITH RECURSIVE cte AS (
    -- start the series on the first day of the earliest month so that every
    -- later month up to MAX(start_date) is generated
    SELECT CAST(DATE_FORMAT(MIN(start_date), '%Y-%m-01') AS DATE) AS minstdt,
           MAX(start_date) AS maxstdt
    FROM mytable
    UNION ALL
    SELECT minstdt + INTERVAL 1 MONTH, maxstdt
    FROM cte
    WHERE minstdt + INTERVAL 1 MONTH <= maxstdt
)
SELECT year,
       month,
       number_of_trips,
       number_of_trips - IFNULL(prev_month_number_of_trips, 0) AS This_month_vs_prev_month,
       IF(number_of_trips > IFNULL(prev_month_number_of_trips, 0), 1, 0) AS Is_increased
FROM (
    SELECT YEAR(cte.minstdt) AS year,
           MONTH(cte.minstdt) AS month,
           -- months without a matching trip have start_date NULL and count as 0
           SUM(CASE WHEN start_date IS NULL THEN 0 ELSE 1 END) AS number_of_trips,
           LAG(SUM(CASE WHEN start_date IS NULL THEN 0 ELSE 1 END))
               OVER (ORDER BY YEAR(cte.minstdt), MONTH(cte.minstdt)) AS prev_month_number_of_trips
    FROM cte
    LEFT JOIN mytable
        ON YEAR(cte.minstdt) = YEAR(start_date)
       AND MONTH(cte.minstdt) = MONTH(start_date)
    GROUP BY year, month
) V
ORDER BY year, month;

  1. I've used a recursive common table expression (cte) to generate one row per month between the minimum and maximum start_date in the table.
  2. I've replaced EXTRACT() with the YEAR() and MONTH() functions to make it slightly shorter.
  3. I LEFT JOIN the cte with the data table, so months without any trips still appear with a count of 0.
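
The gap-filling idea in the list above can be tried out without a MySQL server. The following is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later, for window functions) and timestamps stored as ISO text without the UTC suffix. The function name monthly_trip_changes and the sample dates are made up for illustration, and SQLite's date() modifiers stand in for MySQL's YEAR(), MONTH() and INTERVAL arithmetic.

```python
import sqlite3


def monthly_trip_changes(start_dates):
    """Return (year, month, trips_this_month, trips_previous_month,
    difference_from_previous_month, is_increased) rows, one per calendar
    month between the earliest and latest trip, including empty months."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (trip_id INTEGER PRIMARY KEY, start_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO mytable (start_date) VALUES (?)",
        [(d,) for d in start_dates],
    )
    rows = conn.execute(
        """
        WITH RECURSIVE months(month_start, last_month) AS (
            -- anchor: first day of the earliest and latest months
            SELECT date(MIN(start_date), 'start of month'),
                   date(MAX(start_date), 'start of month')
            FROM mytable
            UNION ALL
            -- step forward one month at a time until the last month
            SELECT date(month_start, '+1 month'), last_month
            FROM months
            WHERE month_start < last_month
        ),
        counts AS (
            -- LEFT JOIN keeps months without trips; COUNT ignores the NULLs
            SELECT m.month_start, COUNT(t.trip_id) AS trips_this_month
            FROM months AS m
            LEFT JOIN mytable AS t
              ON date(t.start_date, 'start of month') = m.month_start
            GROUP BY m.month_start
        ),
        with_prev AS (
            SELECT month_start,
                   trips_this_month,
                   LAG(trips_this_month, 1, 0)
                       OVER (ORDER BY month_start) AS trips_previous_month
            FROM counts
        )
        SELECT CAST(strftime('%Y', month_start) AS INTEGER),
               CAST(strftime('%m', month_start) AS INTEGER),
               trips_this_month,
               trips_previous_month,
               trips_this_month - trips_previous_month,
               trips_this_month > trips_previous_month
        FROM with_prev
        ORDER BY month_start
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data with no trips in February 2018.
sample = [
    "2017-12-31 20:40:25",
    "2018-01-12 21:01:51",
    "2018-01-20 08:00:00",
    "2018-03-05 09:30:00",
]
for row in monthly_trip_changes(sample):
    print(row)
```

With this sample, February 2018 is reported with 0 trips and a difference of -2, and March is then compared against that empty February rather than against January.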

See if you can work with this.

CodePudding user response:

Consider a self-join that compares the current and previous months' aggregates, using each trip's date normalized to the first day of its month:

WITH sub AS (
    SELECT 
       DATE_SUB(
           DATE_ADD(LAST_DAY(start_date), INTERVAL 1 DAY), 
           INTERVAL 1 MONTH
       ) AS month_year,
       COUNT(*) AS number_of_trips
    FROM a_table
    GROUP BY month_year
), calc AS (
    SELECT
       YEAR(curr.month_year) AS year,
       MONTH(curr.month_year) AS month,
       COALESCE(curr.number_of_trips, 0) AS trips_this_month,
       COALESCE(prev.number_of_trips, 0) AS trips_previous_month
    FROM sub AS curr
    LEFT JOIN sub AS prev
        ON prev.month_year = DATE_SUB(curr.month_year, INTERVAL 1 MONTH) 
)

SELECT
    year,
    month,
    trips_this_month,
    trips_previous_month,
    trips_this_month - trips_previous_month AS difference_from_previous_month,
    (trips_this_month - trips_previous_month) > 0 AS is_increased
FROM calc
ORDER BY year, month;
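
One practical difference between this self-join and a LAG() over the aggregated rows shows up when a month has no trips at all: the self-join looks for exactly the preceding calendar month and falls back to 0, while LAG() returns the most recent month that has any data. A minimal sketch of that comparison, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) and ISO text timestamps without the UTC suffix; the function name compare_previous_month and the sample dates are hypothetical, and SQLite's date() modifiers replace the MySQL LAST_DAY()/DATE_SUB() normalization.

```python
import sqlite3


def compare_previous_month(start_dates):
    """Return (month_start, trips, prev_by_self_join, prev_by_lag) rows,
    one per month that has at least one trip."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE a_table (trip_id INTEGER PRIMARY KEY, start_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO a_table (start_date) VALUES (?)",
        [(d,) for d in start_dates],
    )
    rows = conn.execute(
        """
        WITH sub AS (
            -- one row per month that has trips, keyed by its first day
            SELECT date(start_date, 'start of month') AS month_year,
                   COUNT(*) AS number_of_trips
            FROM a_table
            GROUP BY month_year
        )
        SELECT curr.month_year,
               curr.number_of_trips,
               -- self-join: exactly the preceding calendar month, else 0
               COALESCE(prev.number_of_trips, 0) AS prev_by_self_join,
               -- LAG: the preceding row, whichever month that is
               LAG(curr.number_of_trips, 1, 0)
                   OVER (ORDER BY curr.month_year) AS prev_by_lag
        FROM sub AS curr
        LEFT JOIN sub AS prev
          ON prev.month_year = date(curr.month_year, '-1 month')
        ORDER BY curr.month_year
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data with no trips in February 2018.
sample = [
    "2017-12-31 20:40:25",
    "2018-01-12 21:01:51",
    "2018-01-20 08:00:00",
    "2018-03-05 09:30:00",
]
for row in compare_previous_month(sample):
    print(row)
```

For March 2018 in this sample, the self-join reports 0 previous trips (February is empty) while LAG() reports January's 2, so which behaviour is wanted depends on how empty months should be treated.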