I have a table that looks like this:
| naming | year |
|--------|------|
| A.     | 2003 |
| B.     | 1994 |
| B.     | 1996 |
| A.     | 2001 |
| A.     | 2002 |
| A.     | 1999 |
| B.     | 1999 |
I want to find all names that occur at least three times in consecutive years. For example, A. occurs in three consecutive years (2001, 2002, 2003), so I want to return A. How can I achieve this in SQL?
CodePudding user response:
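SELECT DISTINCT t1.*
FROM test AS t1
-- match on naming as well as year, so runs are only built within one name
JOIN test AS t2 ON t2.naming = t1.naming AND t1.year = t2.year - 1
JOIN test AS t3 ON t3.naming = t1.naming AND t1.year = t3.year - 2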
On the sample data this returns the row (A., 2001), the first year of the three-year run.
CodePudding user response:
You can achieve this by joining the table with itself:
SELECT
w.naming
, w.year || '-' || (w.year + 2) AS year_range
FROM words w
INNER JOIN words w2 ON w.naming = w2.naming
WHERE w2.year BETWEEN w.year AND w.year + 2
GROUP BY w.naming, w.year
-- count distinct years so duplicate rows cannot inflate the total
HAVING COUNT(DISTINCT w2.year) = 3
| naming | year_range |
|--------|------------|
| A.     | 2001-2003  |
This query joins each row to the rows with the same naming whose year falls in the three-year window starting at that row's year. It then groups by naming and starting year, and keeps only the groups whose window contains three consecutive years.
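To check the window join against the question's sample rows, here is a minimal sketch that loads them into an in-memory SQLite database through Python's `sqlite3` module. The choice of SQLite and the helper name `consecutive_runs` are assumptions made for illustration; the table name `words` follows this answer. The sketch counts distinct years rather than rows, so a duplicated row cannot push a window over the threshold.

```python
import sqlite3

# Sample rows from the question (names written as "A." / "B.").
ROWS = [
    ("A.", 2003), ("B.", 1994), ("B.", 1996), ("A.", 2001),
    ("A.", 2002), ("A.", 1999), ("B.", 1999),
]

QUERY = """
SELECT w.naming,
       w.year || '-' || (w.year + 2) AS year_range
FROM words w
INNER JOIN words w2 ON w.naming = w2.naming
WHERE w2.year BETWEEN w.year AND w.year + 2
GROUP BY w.naming, w.year
HAVING COUNT(DISTINCT w2.year) = 3
ORDER BY w.naming, w.year
"""


def consecutive_runs(rows):
    """Return (naming, year_range) for each window of three consecutive years."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE words (naming VARCHAR(255) NOT NULL, year INT NOT NULL)"
        )
        conn.executemany("INSERT INTO words VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


print(consecutive_runs(ROWS))
```

A name with a run longer than three years would appear once per three-year window, so wrap the query in `SELECT DISTINCT naming` if only the names are needed.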
UPDATE:
Given this dataset:
CREATE TABLE words (
naming VARCHAR(255) NOT NULL,
year INT NOT NULL
);
INSERT INTO words VALUES
('A.', 2003),
('B.', 1994),
('B.', 1996),
('A.', 2001),
('A.', 2002),
('A.', 1999),
('B.', 1999),
('C.', 2000),
('C.', 2000),
('C.', 2001),
('C.', 2002);
This query returns only the namings that have exactly three distinct years, all of them consecutive:
SELECT
    w.naming
    , w.year || '-' || (w.year + 2) AS year_range
FROM (
    SELECT *
    FROM words
    WHERE naming IN (
        SELECT naming
        FROM words
        GROUP BY naming
        HAVING COUNT(DISTINCT year) = 3
    )
) AS w
INNER JOIN words w2 ON w.naming = w2.naming
WHERE w2.year BETWEEN w.year AND w.year + 2
GROUP BY w.naming, w.year
-- count distinct years so the duplicated ('C.', 2000) row is not counted twice
HAVING COUNT(DISTINCT w2.year) = 3;
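As an alternative that is not taken from the answers above, the original question (at least three consecutive years, with runs of any length) can also be approached with the gaps-and-islands technique: within one naming, `year - ROW_NUMBER()` stays constant across a run of consecutive years. The sketch below assumes SQLite 3.25 or later for window-function support and runs through Python's `sqlite3` module; the helper name `find_runs` and the column alias `grp` are made up for illustration.

```python
import sqlite3

# Dataset from the UPDATE section, including the duplicated ('C.', 2000) row.
ROWS = [
    ("A.", 2003), ("B.", 1994), ("B.", 1996), ("A.", 2001),
    ("A.", 2002), ("A.", 1999), ("B.", 1999),
    ("C.", 2000), ("C.", 2000), ("C.", 2001), ("C.", 2002),
]

ISLANDS_QUERY = """
WITH distinct_years AS (
    -- drop duplicate (naming, year) rows so they cannot break a run
    SELECT DISTINCT naming, year FROM words
),
numbered AS (
    -- consecutive years share the same value of year - row_number
    SELECT naming, year,
           year - ROW_NUMBER() OVER (PARTITION BY naming ORDER BY year) AS grp
    FROM distinct_years
)
SELECT naming, MIN(year) AS first_year, MAX(year) AS last_year
FROM numbered
GROUP BY naming, grp
HAVING COUNT(*) >= 3
ORDER BY naming, first_year
"""


def find_runs(rows):
    """Return (naming, first_year, last_year) for each run of 3+ consecutive years."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE words (naming VARCHAR(255) NOT NULL, year INT NOT NULL)"
        )
        conn.executemany("INSERT INTO words VALUES (?, ?)", rows)
        return conn.execute(ISLANDS_QUERY).fetchall()
    finally:
        conn.close()


print(find_runs(ROWS))
```

Unlike the query above, this variant does not exclude namings that also have other years outside the run, so A. is reported alongside C. for this dataset.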