Home > Enterprise >  Get the average difference between two values in time period
Get the average difference between two values in time period

Time:02-14

What I have: A single table with ids and prices for date and time.

What I want to do: To get the average difference of price for every id between the first and 15 minute of every day. Conditions: I also want to use the last available price in 15 minutes if the 15 is missing. For example if 15 is missing, use the price in 14 minutes.

What I did: My first step was to get the price after 15 minutes of every day for every id.

My code (I did a lot of tests with it, sorry if it doesn't look good and optimised enough) to get the last price(but not last available) in the 15 minutes interval:

WITH s as (
  SELECT DATE(date_value) as dt, date_value, stock_id sid, open FROM stocks_1min
)

SELECT sid, dt, date_value, s.open as second_price
FROM s

INNER JOIN (
    SELECT q.stock_id, DATE(q.date_value) as dat
    FROM stocks_1min q
    GROUP BY q.stock_id, dat
    ORDER BY q.stock_id ASC
  ) as end_price
           ON s.sid = end_price.stock_id 
    AND s.date_value = (SELECT date_value FROM s WHERE s.sid = end_price.stock_id AND s.dt = end_price.dat LIMIT 1)   interval '15 minutes'    

ORDER BY sid, dt;

What I didn't do:

  • get the last AVAILABLE price in that interval.
  • get the first price in that interval and use it in the same query with the last price in the interval
  • If I got both prices then I can use AVG and Group By to get the needed value.

Expected result (example): Table:

|stock_id|open(price)|date_value      |
|1       |10         |01-12-2020 09:31|
|1       |12         |01-12-2020 09:46|
|1       |14         |02-12-2020 09:31|
|1       |20         |02-12-2020 09:46|
|2       |14         |01-12-2020 09:31|
|2       |16         |01-12-2020 09:46|
|2       |2          |02-12-2020 09:31|
|2       |16         |02-12-2020 09:46|

Result:

|stock_id|average_difference|
| 1      | 4                | ((12-10)   (20-14)) / 2 = 4
| 2      | 8                | ((16-14)   (16-2)) / 2 = 8

I have never used "Partition By", but maybe here is the place to start with it? It's my own project, where I want to test some statistical ideas for the stock market. Thanks in advance!

You can use this link where it's a sample for testing: http://sqlfiddle.com/#!17/83597/1

CodePudding user response:

If I have understood correctly, you want to:

  1. List the prices of each stock from each day during the first 15 minutes of the day
  2. Identify the first and last (as defined by date_value) of the prices listed in step 1
  3. Calculate the difference between the prices identified in step 2, by stock and day
  4. Calculate the average of the differences calculated in step 3, by stock

If I have interpreted your question correctly, then:

WITH dat
AS
(
SELECT stock_id
      ,date_value::date AS date_of_date_value
      ,date_value
      ,MIN(date_value) OVER (PARTITION BY stock_id, date_value::date) AS min_date_value
      ,MAX(date_value) OVER (PARTITION BY stock_id, date_value::date) AS max_date_value
      ,"open" AS open_price
FROM stocks_1min
WHERE date_value::time <= '09:45:00.000'
),
dat2
AS
(
SELECT stock_id
      ,date_of_date_value
      ,AVG(CASE WHEN min_date_value = date_value THEN open_price ELSE NULL END) AS open_price_from
      ,AVG(CASE WHEN max_date_value = date_value THEN open_price ELSE NULL END) AS open_price_to
      ,AVG(CASE WHEN max_date_value = date_value THEN open_price ELSE NULL END) 
           - AVG(CASE WHEN min_date_value = date_value THEN open_price ELSE NULL END) AS open_price_diff
FROM dat
WHERE date_value = min_date_value OR date_value = max_date_value
GROUP BY stock_id
        ,date_of_date_value
)
SELECT stock_id, AVG(open_price_diff) AS avg_open_price_diff
FROM dat2
GROUP BY stock_id;

You may find that you need to wrap the calculation of open_price_diff with ABS, but I have not done that in the example above.

sqlfiddle

CodePudding user response:

I continued with the idea from your SQL and tried to get the solution without analytical function which you can if you want to. Here are the steps:

Step 1: Minimum recorded time for each day.

with day_min_time
as
(
 SELECT Min(date_value)  min_time,
       Date(date_value) dt,
       stock_id
FROM   stock_1min
GROUP  BY Date(date_value),
          stock_id 
)

Step 2: Maximum recorded time for each day which is 15 minutes from minimum recorded time for the day.It is using the CTE from step 1.

with day_max_time
 as
(
 SELECT   Max(sd.date_value)  max_time ,
         Date(sd.date_value) dt,
         sd.stock_id
FROM     stock_1min sd
JOIN     day_min_time dmt
ON       sd.stock_id = dmt.stock_id
AND      Date(sd.date_value) = dmt.dt
WHERE    sd.date_value <= dmt.min_time   interval '15 minutes'
GROUP BY date(sd.date_value) ,
         sd.stock_id)

Step 3: Use the time boundaries obtained from Step 1 and Step 2 to get the maximum and minimum prices.

SELECT Max(sd.open) - Min(sd.open) price_difference,
       sd.stock_id,
       DATE(sd.date_value)
FROM   stock_1min sd
       INNER JOIN day_min_time min_dmt
               ON min_dmt.stock_id = sd.stock_id
                  AND DATE(sd.date_value) = min_dmt.dt
       INNER JOIN day_max_time max_dmt
               ON max_dmt.stock_id = sd.stock_id
                  AND DATE(sd.date_value) = max_dmt.dt
WHERE  sd.date_value BETWEEN min_dmt.min_time AND max_dmt.max_time
GROUP  BY sd.stock_id,
          DATE(sd.date_value) 

Step 4: Get the average difference using the result from step 3:

SELECT Round(Avg(difference)) average_difference,
       stock_id
FROM   (SELECT Max(sd.open) - Min(sd.open) price_difference,
               sd.stock_id,
               DATE(sd.date_value)
        FROM   stock_1min sd
               INNER JOIN day_min_time min_dmt
                       ON min_dmt.stock_id = sd.stock_id
                          AND DATE(sd.date_value) = min_dmt.dt
               INNER JOIN day_max_time max_dmt
                       ON max_dmt.stock_id = sd.stock_id
                          AND DATE(sd.date_value) = max_dmt.dt
        WHERE  sd.date_value BETWEEN min_dmt.min_time AND max_dmt.max_time
        GROUP  BY sd.stock_id,
                  DATE(sd.date_value)) final_inline_view
GROUP  BY stock_id; 

Final SQL:Combining Everything:

WITH 
day_min_time AS
(
         SELECT   Min(date_value)  min_time,
                  Date(date_value) dt,
                  stock_id
         FROM     stock_1min
         GROUP BY Date(date_value),
                  stock_id ), 
day_max_time AS
(
         SELECT   Max(sd.date_value)  max_time ,
                  Date(sd.date_value) dt,
                  sd.stock_id
         FROM     stock_1min sd
         JOIN     day_min_time dmt
         ON       sd.stock_id = dmt.stock_id
         AND      Date(sd.date_value) = dmt.dt
         WHERE    sd.date_value <= dmt.min_time   interval '15 minutes'
         GROUP BY date(sd.date_value) ,
                  sd.stock_id )
SELECT   round(avg(price_difference)) average_difference,
         stock_id
FROM     (
                    SELECT     max(sd.OPEN)-min(sd.OPEN) price_difference,
                               sd.stock_id ,
                               date(sd.date_value)
                    FROM       stock_1min sd
                    INNER JOIN day_min_time min_dmt
                    ON         min_dmt.stock_id = sd.stock_id
                    AND        date(sd.date_value) = min_dmt.dt
                    INNER JOIN day_max_time max_dmt
                    ON         max_dmt.stock_id = sd.stock_id
                    AND        date(sd.date_value) = max_dmt.dt
                    WHERE      sd.date_value BETWEEN min_dmt.min_time AND        max_dmt.max_time
                    GROUP BY   sd.stock_id ,
                               date(sd.date_value) ) final_inline_view
GROUP BY stock_id;

Output:

 average_difference | stock_id
-------------------- ----------
                  4 |        1
                  8 |        2
(2 rows)

Table:

create table stock_1min(stock_id int , open int , date_value timestamp);
insert into stock_1min values
(1,10,'2020-12-01 09:31'),
(1,12,'2020-12-01 09:46'),
(1,14,'2020-12-02 09:31'),
(1,20,'2020-12-02 09:46'),
(2,14,'2020-12-01 09:31'),
(2,16,'2020-12-01 09:46'),
(2,2,'2020-12-02 09:31'),
(2,16,'2020-12-02 09:46');

postgres=# select * from stock_1min;
 stock_id | open |     date_value
---------- ------ ---------------------
        1 |   10 | 2020-12-01 09:31:00
        1 |   12 | 2020-12-01 09:46:00
        1 |   14 | 2020-12-02 09:31:00
        1 |   20 | 2020-12-02 09:46:00
        2 |   14 | 2020-12-01 09:31:00
        2 |   16 | 2020-12-01 09:46:00
        2 |    2 | 2020-12-02 09:31:00
        2 |   16 | 2020-12-02 09:46:00
(8 rows)

  • Related