Home > Enterprise >  "partition by" giving incorrect value
"partition by" giving incorrect value

Time:09-19

battery_pct   tstamp    charging    phone_id
90            t1        yes         12
91            t2        yes         22
95            t3        no          22 
89            t4        no          22
87            t5        no          22
80            t6        no          22
78            t7        yes         22
85            t8        yes         4
50            t9        no          4
40            t10       no          4
38            t11       no          4
20            t12       yes         4

I want to calculate battery depletion rate as : change in battery / time taken

This should be calculated for ALL the windows when charging is 'no' (sandwiched in between 2 "yes"), and then the average of those rates should be taken.

So, for this dataset it should be:

95 - 80 / t6 - t3 = rate for phone_id 22

50 - 38 / t11 - t9 = rate for phone_id 4

average rate = ( rate 1 rate 2 ) / 2

Please note there can be more than one windows of no's for each phone_id in the data. I have to find average rate across ALL phone id's. i.e. one value for average rate which encompasses all phones.

Here is my current code, it does not give any error, but is returning a value that is NOT plausible -

with discharge_intervals as (
  select battery_pct, tstamp, 
         sum((charging = 'yes')::int) over (partition by phone_id order by tstamp) as ival_number,
         charging = 'no' as keep
    from dataset
), interval_rates as (
  select ival_number, 
         (max(battery_pct) - min(battery_pct)) 
           / extract(epoch from max(tstamp) - min(tstamp)) as ival_rate
    from discharge_intervals
   where keep
   group by ival_number
)
select avg(ival_rate) 
  from interval_rates;

CodePudding user response:

Your interval_rates are calculated without grouping by phone, but should be. The ival_numbers are partitioned by phone_id, but that just means multiple phones will create rows with the same ival_number. You'll want to use

with discharge_intervals as (
  select battery_pct, tstamp, phone_id,
--                            ^^^^^^^^^
         sum((charging = 'yes')::int) over (partition by phone_id order by tstamp) as ival_number,
         charging = 'no' as keep
    from dataset
), interval_rates as (
  select (max(battery_pct) - min(battery_pct)) 
           / extract(epoch from max(tstamp) - min(tstamp)) as ival_rate
    from discharge_intervals
   where keep
   group by phone_id, ival_number
--          ^^^^^^^^^
)
select avg(ival_rate) 
  from interval_rates;
  • Related