Home > Net >  How to calculate total revenue within 3 months in postgresql
How to calculate total revenue within 3 months in postgresql

Time:12-14

Thanks in advance for any help.

I have a table with unique tickets, customer IDs and ticket price. For each ticket, I want to see the number of tickets and total revenue from a customer 3 months after the date of the ticket.

I tried to use the partition by function with the date condition set in the on clause, but it just evaluates all tickets of the customer rather than the 3 month period I want.

select distinct on (at2.ticket_number)
at2.customer_id 
,at2.ticket_id 
,at2.ticket_number 
,at2.initial_sale_date 
,ata.tix "a_tix"
,ata.aov "a_aov"
,ata.rev "a_rev"

from reports.agg_tickets at2 
left join (select at2.customer_id, at2.final_fare_value, at2.initial_sale_date, count(at2.customer_id)  OVER (PARTITION BY at2.customer_id) AS tix,
            avg(at2.final_fare_value) over (partition by at2.customer_id) as aov,
            sum(at2.final_fare_value) over (partition by at2.customer_id) as rev
            from reports.agg_tickets at2
            ) ata
            on (ata.customer_id = at2.customer_id 
            and ata.initial_sale_date > at2.initial_sale_date 
            and ata.initial_sale_date < at2.initial_sale_date    interval '3 months')

I could use a left join lateral, but it takes far too long. Slightly confused with how to achieve what I want, so any help would be greatly appreciated.

Many thanks

Edit: Here is the sample of data. Picture of data table.

The table is unique on ticket number, but not on customer.

CodePudding user response:

No need to use a join at all, this will yield (as you observe) a problemetic performnce. What is your solution is a plain window function with a frame_clause that will consider the next 3 months for each ticket

Example (self explained)

count(*) over (partition by customer_id order by initial_sale_date 
               range between current row and '3 months'  following) ticket_cnt

Here a full query with simplified sample data and the result

with dt as (
select * from  (values
(1, 1, date'2020-01-01', 10),
(1, 2, date'2020-02-01', 15),
(1, 3, date'2020-03-01', 20),
(1, 4, date'2020-04-01', 25),
(1, 5, date'2020-05-01', 30),
(2, 6, date'2020-01-01', 15),
(2, 7, date'2020-02-01', 20),
(2, 7, date'2021-01-01', 25)
) tab (customer_id, ticket_id, initial_sale_date,final_fare_value)
)
select 
customer_id, ticket_id, initial_sale_date, final_fare_value,
count(*) over (partition by customer_id order by initial_sale_date range between current row and '3 months'  following) ticket_cnt,
sum(final_fare_value) over (partition by customer_id order by initial_sale_date range between current row and '3 months'  following) ticket_sum
from dt;

customer_id|ticket_id|initial_sale_date|final_fare_value|ticket_cnt|ticket_sum|
----------- --------- ----------------- ---------------- ---------- ---------- 
          1|        1|       2020-01-01|              10|         4|        70|
          1|        2|       2020-02-01|              15|         4|        90|
          1|        3|       2020-03-01|              20|         3|        75|
          1|        4|       2020-04-01|              25|         2|        55|
          1|        5|       2020-05-01|              30|         1|        30|
          2|        6|       2020-01-01|              15|         2|        35|
          2|        7|       2020-02-01|              20|         1|        20|
          2|        7|       2021-01-01|              25|         1|        25|
  • Related