I have a table:
Name Seconds Status_measure
a 0 10
a 10 13
a 20 -1
a 30 15
a 40 20
a 50 12
a 60 -1
For each name, I want a new column that holds the number of values greater than -1 that occur after the first -1 has been reached. In this data, the new column for the name "a" should have the value 3, because after the first -1 in Status_measure there are three values greater than -1 (15, 20 and 12).
Required data frame:
Id Name Seconds Status_measure Value
1 a 0 10 3
2 a 10 13 3
3 a 20 -1 3
4 a 30 15 3
5 a 40 20 3
6 a 50 12 3
7 a 60 -1 3
I tried:
count(status_measure>-1) over (partition by name order by seconds)
But this does not give the desired result.
CodePudding user response:
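You can do it in two steps: first assign each row a group number (grp) that increases by one at every -1 row, then count the rows with Status_measure > -1 in the group grp = 1.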
select *, sum(Status_measure > -1 and grp = 1) over(partition by name) n
from (
select *
, row_number() over(partition by name order by Seconds) - sum(Status_measure > -1 ) over(partition by name order by Seconds) grp
from tbl
) t
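To see why grp = 1 picks out the rows after the first -1, the query can be run against the sample data from the question. The sketch below does this from Python with an in-memory SQLite database. SQLite is an assumption made only to keep the example self-contained (the answers in this thread use MySQL syntax, and SQLite 3.25 or later is needed for window functions). The helper name run_grouping_query is illustrative.

```python
import sqlite3

# Sample rows from the question: (Name, Seconds, Status_measure).
ROWS = [
    ("a", 0, 10), ("a", 10, 13), ("a", 20, -1), ("a", 30, 15),
    ("a", 40, 20), ("a", 50, 12), ("a", 60, -1),
]

# Same logic as the query above; grp is selected so it can be inspected.
QUERY = """
select Seconds, Status_measure, grp,
       sum(Status_measure > -1 and grp = 1) over (partition by Name) as n
from (
    select *,
           row_number() over (partition by Name order by Seconds)
           - sum(Status_measure > -1) over (partition by Name order by Seconds) as grp
    from tbl
) t
order by Seconds
"""


def run_grouping_query():
    """Load the sample rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tbl (Name text, Seconds integer, Status_measure integer)"
    )
    conn.executemany("insert into tbl values (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_grouping_query():
        print(row)  # (Seconds, Status_measure, grp, n)
```

On the sample data, grp comes out as 0, 0, 1, 1, 1, 1, 2 in Seconds order, and n is 3 on every row. Rows after a second -1 would fall into grp = 2 and would not be counted.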
CodePudding user response:
Another option is a user variable that acts as a flag. It:
- starts at 0
- increases to 1 when the first -1 is reached
- decreases back to 0 when the second -1 is reached
Once you have this column, you can sum its values.
SET @change = 0;
SELECT *, SUM(CASE WHEN Status_measure = -1
THEN IF(@change=0, @change := @change + 1, @change := @change - 1)
ELSE @change END) OVER() -1 AS Value_
FROM tab
Limitations: this solution assumes there is only one range of interesting values between -1s.
Note: 1 is subtracted from the sum because the first update of the variable leaves a 1 on the -1 row itself, which you don't want to count. To see this, comment out the SUM() OVER() and look at the intermediate output.
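The intermediate values of the variable can also be traced outside the database. The following is a minimal Python sketch of the same toggle, assuming the rows are processed in Seconds order; the function name toggle_flags is illustrative and not part of the query.

```python
def toggle_flags(values):
    """Return the value of the flag on each row, mirroring @change in the query."""
    change = 0
    flags = []
    for v in values:
        if v == -1:
            # Same rule as IF(@change=0, @change+1, @change-1):
            # 0 -> 1 on the first -1, 1 -> 0 on the next one.
            change = change + 1 if change == 0 else change - 1
        flags.append(change)
    return flags


# Status_measure values from the question, in Seconds order.
status = [10, 13, -1, 15, 20, 12, -1]
flags = toggle_flags(status)

# The first -1 row itself carries a 1, hence the "- 1" correction.
value = sum(flags) - 1

if __name__ == "__main__":
    print(flags)  # [0, 0, 1, 1, 1, 1, 0]
    print(value)  # 3
```

The flag is 1 on four rows (the first -1 and the three values after it), so the sum is 4 and the corrected result is 3.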
CodePudding user response:
This is more of a request for clarification first. I want to expand your original data to include another row for the sake of 2 vs. 3 entries. Also, is there an auto-increment ID in your data that defines the sequential order of the rows, such as:
Id Name Seconds Status_measure Value
1 a 0 10 3
2 a 10 13 3
3 a 20 -1 3
4 a 30 15 3
5 a 40 20 3
6 a 50 12 3
7 a 60 -1 3
If the rows are sequential, IDs 1 and 2 come before the -1 at ID 3, which would indicate two entries. IDs 4-6 then give a count of three entries before the -1 at ID 7.
So, what "Value" do you want in your result? Should it be 2 for IDs 1, 2 and 3, and 3 for IDs 4-7? Or should ALL entries show the greatest count found before a -1 measure, that is, 3 for every row?
Please EDIT your question (you can copy/paste this table into it if need be) and provide the clarification requested, including whether there is an auto-increment ID, since that affects the final output and how the breaks are determined.
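The two readings of "Value" described above can be made concrete with a small Python sketch. It assumes the rows are already in ID order; run_lengths is a hypothetical helper, and trailing rows not closed by a -1 are ignored.

```python
def run_lengths(values, sentinel=-1):
    """Count the non-sentinel rows in each run that is closed by a sentinel row."""
    runs, current = [], 0
    for v in values:
        if v == sentinel:
            runs.append(current)
            current = 0
        else:
            current += 1
    return runs


# Status_measure values for IDs 1-7.
status = [10, 13, -1, 15, 20, 12, -1]
runs = run_lengths(status)

# Reading 1: every row shows the greatest count.
greatest = max(runs)

# Reading 2: each row shows the count of its own run
# (the run's rows plus the -1 row that closes it).
per_row = []
for count in runs:
    per_row.extend([count] * (count + 1))

if __name__ == "__main__":
    print(runs)      # [2, 3]
    print(greatest)  # 3
    print(per_row)   # [2, 2, 2, 3, 3, 3, 3]
```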