I need to delete some rows in the dataset, of which the speed
equals zero and lasting over N times (let's assume N is 2).
The structure of the table demo
looks like:
id | car | speed | time |
---|---|---|---|
1 | foo | 0 | 1 |
2 | foo | 0 | 2 |
3 | foo | 0 | 3 |
4 | foo | 1 | 4 |
5 | foo | 1 | 5 |
6 | foo | 0 | 6 |
7 | bar | 0 | 1 |
8 | bar | 0 | 2 |
9 | bar | 5 | 3 |
10 | bar | 5 | 4 |
11 | bar | 5 | 5 |
12 | bar | 5 | 6 |
Then I hope to generate a table like the one below by using window_function
:
id | car | speed | time | lasting |
---|---|---|---|---|
1 | foo | 0 | 1 | 3 |
2 | foo | 0 | 2 | 3 |
3 | foo | 0 | 3 | 3 |
4 | foo | 1 | 4 | 2 |
5 | foo | 1 | 5 | 2 |
6 | foo | 0 | 6 | 1 |
7 | bar | 0 | 1 | 2 |
8 | bar | 0 | 2 | 2 |
9 | bar | 5 | 3 | 4 |
10 | bar | 5 | 4 | 4 |
11 | bar | 5 | 5 | 4 |
12 | bar | 5 | 6 | 4 |
Then I can easily exclude those rows by using WHERE NOT (speed = 0 AND lasting > 2)
Put the code I tried here, but it didn't return the value I expected and I guess those FROM (SELECT ... FROM (SELECT ...
might not be the best practice to solve the problem:
SELECT g3.*, count(id) OVER (PARTITION BY car, cumsum ORDER BY id) as num
FROM (SELECT g2.*, sum(grp2) OVER (PARTITION BY car ORDER BY id) AS cumsum
FROM (SELECT g1.*, (CASE ne0 WHEN 0 THEN 0 ELSE 1 END) AS grp2
FROM (SELECT g.*, speed - lag(speed, 1, 0) OVER (PARTITION BY car) AS ne0
FROM (SELECT *, row_number() OVER (PARTITION BY car) AS grp FROM demo) g ) g1 ) g2 ) g3
ORDER BY id;
CodePudding user response:
You can use window function LAG()
to check for the previous speed
value for each row and SUM()
window function to create the groups for the continuous values.
Then with COUNT()
window function you can count the number of rows in each group so that you can filter out the rows with 0 speed
in the groups that have more than 2 rows:
SELECT id, car, speed, time
FROM (
SELECT *, COUNT(*) OVER (PARTITION BY car, grp) counter
FROM (
SELECT *, SUM(flag::int) OVER (PARTITION BY car ORDER BY time) grp
FROM (
SELECT *, speed <> LAG(speed, 1, speed - 1) OVER (PARTITION BY car ORDER BY time) flag
FROM demo
) t
) t
) t
WHERE speed <> 0 OR counter <= 2
ORDER BY id;
See the demo.