Home > Blockchain >  I need to calculate a value running down over time according to another value in bigquery
I need to calculate a value running down over time according to another value in bigquery

Time:08-16

In a BigQuery table, I need to refer to the previous value from the same column to calculate the new value for that column like this:

Timestamp Liters per Minute Liters remaining
2022-08-15 08:50 UTC 5 1000
2022-08-15 08:51 UTC 4 1000 - (avg(5, 4) = 4.5) = 995.5
2022-08-15 08:51 UTC 6 995.5 - (avg(4, 6) = 5) = 990.5
2022-08-15 08:51 UTC 6 990.5 - (avg(6, 6) = 6) = 984.5

Sometimes the 'Liters remaining' will come from outside, so I actually have used coalesce to choose either this external actual value if it exists, or the previous value in time for this column if it doesn't:

liters remaining estimated = coalesce(Liters remaining actual, lag(liters remaining estimated) - liters per minute)

but I can't refer to 'self' (liters remaining estimated) in the calculation (I think!)

Timestamp Liters per Minute Liters remaining actual Liters remaining estimated
2022-08-15 08:50 UTC 5 1000 1000
2022-08-15 08:51 UTC 4 1000 - (avg(5, 4) = 4.5) = 995.5
2022-08-15 08:51 UTC 6 995.5 - (avg(4, 6) = 5) = 990.5
2022-08-15 08:51 UTC 6 990.5 - (avg(6, 6) = 6) = 984.5
2022-08-15 08:51 UTC 6 986 986
2022-08-15 08:51 UTC 3 986 - (avg(6, 3) = 4.5) 981.5

I'm not sure how to split this up or if there is some magic bq sql I'm about to learn :)

CodePudding user response:

Consier below query. I've slightly modified your liters remaining logic like below and applied kind of cumulative sum analytic function to generate expected output.

  • 1000 - SUM(-5, 5) /2 = 1000
  • 1000 - SUM(5, 4) / 2 = 995.5
  • 1000 - SUM(5, 4, 4, 6) / 2 = 990.5
  • 1000 - SUM(5, 4, 4, 6, 6, 6) / 2 = 984.5

actual_liters column is used to partition a table whenever new actual liters is given.

WITH partitioned AS (
  SELECT *,
         LAST_VALUE(Liters_remaining_actual IGNORE NULLS) OVER w AS actual_liters,
         IF(Liters_remaining_actual IS NULL, 
            LAG(Liters_per_Minute) OVER w,  -1 * Liters_per_Minute
         ) AS pre_liters_per_minute,
    FROM sample_table
  WINDOW w AS (ORDER BY Timestamp)
)
SELECT *,
       actual_liters - SUM(Liters_per_Minute   pre_liters_per_minute) OVER w / 2 AS Liters_remaining_estimated
  FROM partitioned
WINDOW w AS (PARTITION BY actual_liters ORDER BY Timestamp)
 ORDER BY Timestamp;

enter image description here

  • Related