I have a table games with values such as:
+----------+------+
| game     | year |
+----------+------+
| Football | 1999 |
| Football | 2000 |
| Football | 2001 |
| Football | 2002 |
| Cricket  | 1996 |
| Tennis   | 2001 |
| Tennis   | 2002 |
| Tennis   | 2003 |
| Tennis   | 2009 |
| Golf     | 1994 |
| Golf     | 1996 |
| Golf     | 1997 |
+----------+------+
I am trying to find every game that has entries for at least three consecutive years in the table. My expected output is:
+----------+
| game     |
+----------+
| Football |
| Tennis   |
+----------+
Because:
- Football has four entries, of which four are consecutive years => 1999, 2000, 2001, 2002
- Tennis has four entries, of which three are consecutive years => 2001, 2002, 2003
To find the games with at least three consecutive entries, I first partitioned the table on game and then computed the difference between the current row's year and the previous row's year, as below:
select game, year,
    case
        when (year - lag(year) over (partition by game order by year)) is null then 1
        else year - lag(year) over (partition by game order by year)
    end as diff
from games
Output of the above query:
+----------+------+------+
| game     | year | diff |
+----------+------+------+
| Football | 1999 |    1 |
| Football | 2000 |    1 |
| Football | 2001 |    1 |
| Football | 2002 |    1 |
| Cricket  | 1996 |    1 |
| Tennis   | 2001 |    1 |
| Tennis   | 2002 |    1 |
| Tennis   | 2003 |    1 |
| Tennis   | 2009 |    6 |
| Golf     | 1994 |    1 |
| Golf     | 1996 |    2 |
| Golf     | 1997 |    1 |
+----------+------+------+
I am not able to proceed from here: I don't know how to filter each game's rows by this difference to get the expected output.
Could anyone let me know if I am on the right track with this implementation? If not, how should I write the query to get the expected output?
CodePudding user response:
You could use a self join approach here:
SELECT DISTINCT g1.Game
FROM games g1
INNER JOIN games g2
    ON g2.Game = g1.Game AND g2.Year = g1.Year + 1
INNER JOIN games g3
    ON g3.Game = g2.Game AND g3.Year = g2.Year + 1;
The above query requires any matching game to have at least one record for which the same game also has a record in the following year and in the year after that.
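To try the self join against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database, the ROWS list, and the function name are illustrative assumptions; the asker's actual database is not stated, so syntax may need adjusting there.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Football", 1999), ("Football", 2000), ("Football", 2001), ("Football", 2002),
    ("Cricket", 1996),
    ("Tennis", 2001), ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
    ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
]

# The self join from the answer, with ORDER BY added for a stable result order.
SELF_JOIN_SQL = """
SELECT DISTINCT g1.game
FROM games g1
INNER JOIN games g2
    ON g2.game = g1.game AND g2.year = g1.year + 1
INNER JOIN games g3
    ON g3.game = g2.game AND g3.year = g2.year + 1
ORDER BY g1.game
"""


def games_with_three_consecutive_years(rows):
    """Load rows into an in-memory SQLite table and run the self join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
    conn.executemany("INSERT INTO games VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(SELF_JOIN_SQL)]
    conn.close()
    return result


print(games_with_three_consecutive_years(ROWS))  # ['Football', 'Tennis']
```

Golf is excluded because 1996 and 1997 form a run of only two years, so the second join finds no match.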
CodePudding user response:
You can use lag() and lead() and compare them to the current Year:
with u as
(select *, case
        when lag(Year) over(partition by Game order by Year) = Year - 1
         and lead(Year) over(partition by Game order by Year) = Year + 1
        then 1 else 0
    end as consec
from games)
select distinct Game
from u
where consec = 1;
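A quick way to check the lag()/lead() idea on the question's data is the sketch below, again with Python's sqlite3. It assumes the SQLite library bundled with Python is version 3.25 or later, which is needed for window functions; the helper name is made up for the example.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Football", 1999), ("Football", 2000), ("Football", 2001), ("Football", 2002),
    ("Cricket", 1996),
    ("Tennis", 2001), ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
    ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
]

# A row is flagged when its neighbours within the same game are
# exactly one year before and one year after it.
LAG_LEAD_SQL = """
WITH u AS (
    SELECT game,
           CASE
               WHEN LAG(year)  OVER (PARTITION BY game ORDER BY year) = year - 1
                AND LEAD(year) OVER (PARTITION BY game ORDER BY year) = year + 1
               THEN 1 ELSE 0
           END AS consec
    FROM games
)
SELECT DISTINCT game FROM u WHERE consec = 1 ORDER BY game
"""


def games_with_middle_year(rows):
    """Return games that have a year with both adjacent years present."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
    conn.executemany("INSERT INTO games VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(LAG_LEAD_SQL)]
    conn.close()
    return result


print(games_with_middle_year(ROWS))  # ['Football', 'Tennis']
```

The flagged row is always the middle year of a run of three, for example Tennis 2002.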
CodePudding user response:
Yes, your initial approach is correct. You were actually really close to fully figuring it out yourself.
What I would do is alter LAG a bit:
year - LAG(year, 2) OVER (
    PARTITION BY game
    ORDER BY year
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
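For each row, this computes the difference between the year of the current row and the year of the row two positions earlier within the same game.
If the current row is the third of three consecutive years, the result is 2. Window functions cannot be used directly in a where clause, so compute this expression in a subquery or CTE and filter on it in the outer query.
If your data contains duplicates, you need to group by game, year first.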
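Put together, this could look like the sketch below, run through Python's sqlite3 (assuming SQLite 3.25+ for window functions). The CTE names dedup and d, the column name span, and the extra duplicate Golf row are assumptions added for illustration; the frame clause is left out here.

```python
import sqlite3

# Sample rows from the question, plus one duplicate (Golf, 1996) added
# here on purpose to show why de-duplication matters.
ROWS = [
    ("Football", 1999), ("Football", 2000), ("Football", 2001), ("Football", 2002),
    ("Cricket", 1996),
    ("Tennis", 2001), ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
    ("Golf", 1994), ("Golf", 1996), ("Golf", 1996), ("Golf", 1997),
]

LAG2_SQL = """
WITH dedup AS (
    -- collapse duplicate (game, year) pairs first
    SELECT game, year FROM games GROUP BY game, year
),
d AS (
    -- distance between this year and the year two rows earlier
    SELECT game,
           year - LAG(year, 2) OVER (PARTITION BY game ORDER BY year) AS span
    FROM dedup
)
SELECT DISTINCT game FROM d WHERE span = 2 ORDER BY game
"""


def games_with_span_of_two(rows):
    """Return games where some row is exactly 2 years after the row two back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
    conn.executemany("INSERT INTO games VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(LAG2_SQL)]
    conn.close()
    return result


print(games_with_span_of_two(ROWS))  # ['Football', 'Tennis']
```

Without the dedup step, the duplicated Golf 1996 row would sit two positions after 1994 and produce a difference of 2, so Golf would be reported incorrectly.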
CodePudding user response:
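Using a CTE (Common Table Expression) and the ROW_NUMBER window function, this can be solved easily. Within a run of consecutive years, year minus the row number stays constant, so grouping on that difference and counting the rows finds runs of three or more. (Counting rows per game alone is not enough: Golf has three rows, but they are not consecutive.)
WITH CTE (game, grp) AS (
    select game, year - ROW_NUMBER() OVER (PARTITION BY game order by year) grp
    from games)
Select Distinct game
from CTE
Group By game, grp
Having COUNT(*) >= 3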
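As a rough illustration of how ROW_NUMBER can detect consecutive runs on the question's sample data: subtracting the row number from the year gives a value that is constant within a run, so each run becomes its own group. The sketch below uses Python's sqlite3 (assuming SQLite 3.25+ and no duplicate rows); the names numbered, grp, and consecutive_runs are made up for the example.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Football", 1999), ("Football", 2000), ("Football", 2001), ("Football", 2002),
    ("Cricket", 1996),
    ("Tennis", 2001), ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
    ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
]

ISLANDS_SQL = """
WITH numbered AS (
    -- year minus row number is the same for every row of one consecutive run
    SELECT game, year,
           year - ROW_NUMBER() OVER (PARTITION BY game ORDER BY year) AS grp
    FROM games
)
SELECT game, MIN(year), MAX(year), COUNT(*)
FROM numbered
GROUP BY game, grp
HAVING COUNT(*) >= 3
ORDER BY game
"""


def consecutive_runs(rows):
    """Return (game, first_year, last_year, length) for runs of 3+ years."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
    conn.executemany("INSERT INTO games VALUES (?, ?)", rows)
    result = conn.execute(ISLANDS_SQL).fetchall()
    conn.close()
    return result


print(consecutive_runs(ROWS))
# [('Football', 1999, 2002, 4), ('Tennis', 2001, 2003, 3)]
```

Unlike a plain row count per game, this leaves out Golf, whose three rows split into a run of one (1994) and a run of two (1996, 1997).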