Sql query to count values after a particular condition is met

Time:08-06

I have a table,

Name   Seconds  Status_measure
a      0           10
a      10          13
a      20          -1
a      30          15
a      40          20
a      50          12
a      60          -1

For each name, I want a new column that counts how many times Status_measure is greater than -1 after the first -1 has been reached. For the data above, the new column for name "a" should have the value 3, because after the first -1 in Status_measure there are three values greater than -1 (15, 20 and 12).

Required data frame:

Id   Name   Seconds  Status_measure    Value
1    a      0           10                3
2    a      10          13                3
3    a      20          -1                3
4    a      30          15                3
5    a      40          20                3
6    a      50          12                3
7    a      60          -1                3

I tried doing

count(status_measure>-1) over (partition by name order by seconds)

But this does not give the desired result.
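To see why the attempt falls short, it can help to run it on the sample data. The sketch below is only an illustration: it assumes SQLite (3.25 or later, for window functions) through Python's `sqlite3` module rather than the asker's actual database, and a table named `tbl`. `COUNT(expr)` counts rows where the expression is not NULL, and the comparison yields 0 or 1, never NULL, so every row is counted; the `ORDER BY` inside `OVER` also turns it into a running count.

```python
import sqlite3

# In-memory table holding the sample rows from the question
# (the table name "tbl" is an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Name TEXT, Seconds INTEGER, Status_measure INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [("a", 0, 10), ("a", 10, 13), ("a", 20, -1), ("a", 30, 15),
     ("a", 40, 20), ("a", 50, 12), ("a", 60, -1)],
)

# The attempted expression: the comparison is 0 or 1 (never NULL),
# so COUNT counts every row, and ORDER BY makes it a running count.
attempt = [row[0] for row in conn.execute("""
    SELECT count(Status_measure > -1) OVER (PARTITION BY Name ORDER BY Seconds)
    FROM tbl
    ORDER BY Seconds
""")]
print(attempt)  # [1, 2, 3, 4, 5, 6, 7]
```

So the expression behaves like a row counter here, not like a count of values above -1.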

CodePudding user response:

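You can do it in two steps: first assign each row a group number (grp), then count the entries greater than -1 in group 1. Here grp is the row number minus the running count of values greater than -1, which equals the number of -1 rows seen so far.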

  select *, sum(Status_measure > -1 and grp = 1) over(partition by name) n
  from (
    select *,
           row_number() over(partition by name order by Seconds)
           - sum(Status_measure > -1) over(partition by name order by Seconds) grp
    from tbl
  ) t
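As a rough check of the query above, here is a sketch that runs the same statement on the sample data and exposes the intermediate grp column. It assumes SQLite 3.25 or later via Python's `sqlite3` (not necessarily the engine the answer was written for); the table name `tbl` comes from the answer, everything else is scaffolding for the illustration.

```python
import sqlite3

# Sample rows from the question, loaded into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Name TEXT, Seconds INTEGER, Status_measure INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [("a", 0, 10), ("a", 10, 13), ("a", 20, -1), ("a", 30, 15),
     ("a", 40, 20), ("a", 50, 12), ("a", 60, -1)],
)

# Same two-step query as in the answer, selecting only the columns of interest.
rows = conn.execute("""
    SELECT Seconds, grp,
           sum(Status_measure > -1 AND grp = 1) OVER (PARTITION BY Name) AS n
    FROM (
        SELECT *,
               row_number() OVER (PARTITION BY Name ORDER BY Seconds)
               - sum(Status_measure > -1) OVER (PARTITION BY Name ORDER BY Seconds) AS grp
        FROM tbl
    ) t
    ORDER BY Seconds
""").fetchall()

grp_values = [r[1] for r in rows]  # number of -1 rows seen so far
n_values = [r[2] for r in rows]    # count of values > -1 inside group 1
print(grp_values)  # [0, 0, 1, 1, 1, 1, 2]
print(n_values)    # [3, 3, 3, 3, 3, 3, 3]
```

The grp column only changes on a -1 row, so grp = 1 marks the stretch that starts at the first -1 and ends just before the second.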

CodePudding user response:

Another option is to use a user variable that:

  • starts at 0
  • is incremented when the first -1 is reached
  • is decremented when the second -1 is reached

Once you have this column, you can sum its values.

SET @change = 0;

SELECT *, SUM(CASE WHEN Status_measure = -1
                   THEN IF(@change=0, @change := @change + 1, @change := @change - 1)
                   ELSE @change END) OVER() -1 AS Value_
FROM tab 


Limitations: this solution assumes you have only one range of interesting values between -1s.

Note: 1 is subtracted from the sum because the first update of the variable leaves a 1 on the row that contains the -1 itself, which should not be counted. To understand this better, remove the SUM() OVER() wrapper and look at the intermediate output.
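The intermediate output mentioned in the note can be traced by hand. The sketch below is a plain Python simulation of the toggle, not the MySQL query itself; it assumes the rows are visited in order of Seconds, which the SQL above does not guarantee without an explicit ORDER BY.

```python
# Status_measure values from the question, in order of Seconds.
status = [10, 13, -1, 15, 20, 12, -1]

change = 0    # plays the role of @change
per_row = []  # what the CASE expression yields on each row
for s in status:
    if s == -1:
        # First -1 switches the flag on, the second switches it off.
        change = change + 1 if change == 0 else change - 1
    per_row.append(change)

# Subtract 1 because the first -1 row itself carries a 1.
value = sum(per_row) - 1
print(per_row)  # [0, 0, 1, 1, 1, 1, 0]
print(value)    # 3
```

The 1 on the third row belongs to the -1 itself, which is why the final subtraction is needed.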

CodePudding user response:

This is more of a request for clarification first. I want to expand your original data to include another row, to illustrate the difference between 2 and 3 entries. Also, is there an auto-increment ID in your data that defines the sequence of the rows, such as:

Id   Name   Seconds  Status_measure    Value
1    a      0           10                3
2    a      10          13                3
3    a      20          -1                3
4    a      30          15                3
5    a      40          20                3
6    a      50          12                3
7    a      60          -1                3

If the rows are sequential, IDs 1 and 2 come before the -1 at ID 3, which gives two entries. IDs 4-6 then give a count of three entries before the -1 at ID 7.

So, what "VALUE" do you want in your result? Should all rows show 3, the greatest count found before a -1? Or should IDs 1, 2 and 3 show 2, and IDs 4-7 show 3?
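To make the two readings concrete, here is a hedged sketch that computes a per-segment count and then the overall maximum. It assumes SQLite 3.25 or later through Python's `sqlite3`, a table named `tbl`, and that a segment ends at each -1 row; the `seg` and `seg_count` names are made up for this illustration.

```python
import sqlite3

# Sample rows from the question, loaded into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Name TEXT, Seconds INTEGER, Status_measure INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [("a", 0, 10), ("a", 10, 13), ("a", 20, -1), ("a", 30, 15),
     ("a", 40, 20), ("a", 50, 12), ("a", 60, -1)],
)

# seg = number of -1 rows strictly before the current row, so each -1
# closes the segment it belongs to. seg_count = values > -1 per segment.
per_segment = [row[0] for row in conn.execute("""
    SELECT sum(Status_measure > -1) OVER (PARTITION BY Name, seg) AS seg_count
    FROM (
        SELECT *,
               sum(Status_measure = -1) OVER (PARTITION BY Name ORDER BY Seconds)
               - (Status_measure = -1) AS seg
        FROM tbl
    ) t
    ORDER BY Seconds
""")]

overall_max = max(per_segment)  # the "3 for all rows" reading
print(per_segment)  # [2, 2, 2, 3, 3, 3, 3]
print(overall_max)  # 3
```

Which of the two outputs is wanted is exactly what the question needs to spell out.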

Please edit your question to add the clarification requested above (you can copy and paste this table into it if need be). Also say whether there is an auto-increment column, since that affects the final output and how the breaks are determined.
