I have data consisting of student information, a student GPA category consisting of two categories ('GPA < 2.5'; 'GPA OK'), and the term associated with the GPA. I want to create a flag column that assigns a static value (0) to the flag for rows where GPA Category equals 'GPA OK', another value when GPA Category equals 'GPA < 2.5' (1), and another value when 'GPA < 2.5' occurs twice in a row after ordering the data by student and term (2). See an example of what I want to accomplish below:
Student Term GPA Category FLAG
StudentA Fall 2021 GPA OK 0 -- Flag for when GPA OK occurs
StudentA Spring 2022 GPA < 2.5 1 -- Flag for when GPA < 2.5 occurs
StudentA Summer 2022 GPA < 2.5 2 -- Flag for when GPA < 2.5 occurs at least twice in a row
StudentB Fall 2021 GPA < 2.5 1
StudentB Summer 2021 GPA < 2.5 2
StudentB Fall 2021 GPA < 2.5 2
StudentC Fall 2021 GPA < 2.5 1
StudentC Summer 2022 GPA < 2.5 2
StudentC Fall 2022 GPA OK 0
StudentC Spring 2023 GPA < 2.5 1
StudentC Summer 2023 GPA OK 0
StudnetC Fall 2023 GPA OK 0
In case it is relevant, the Student, Term, and GPA Category fields are fields that I have calculated in the Select statement using the CONCAT, iif, and CASE function, respectively. Can anybody help me out here? This one has me stumped.
CodePudding user response:
You can use LAG
, I'm assuming you either have a RowNumber or can generate one to keep these records in the desired order. I would also suggest moving your existing SELECT
statement to a CTE since you note both the Student and GPA_Category are calculated, that way you can call them in the final SELECT
as values.
DECLARE @t1 TABLE (RowN INT, Student VARCHAR(50), Term VARCHAR(20), [GPA_Category] VARCHAR(20))
INSERT INTO @t1
VALUES
(1,'StudentA', 'Fall 2021', 'GPA OK'),
(2,'StudentA', 'Spring 2022', 'GPA < 2.5'),
(3,'StudentA', 'Summer 2022', 'GPA < 2.5'),
(4,'StudentB', 'Fall 2021', 'GPA < 2.5'),
(5,'StudentB', 'Summer 2021', 'GPA < 2.5'),
(6,'StudentB', 'Fall 2021', 'GPA < 2.5'),
(7,'StudentC', 'Fall 2021', 'GPA < 2.5'),
(8,'StudentC', 'Summer 2022', 'GPA < 2.5'),
(9,'StudentC', 'Fall 2022', 'GPA OK'),
(10,'StudentC', 'Spring 2023', 'GPA < 2.5'),
(11,'StudentC', 'Summer 2023', 'GPA OK'),
(12,'StudentC', 'Fall 2023', 'GPA OK')
SELECT *,
CASE WHEN LAG(Student, 1) OVER (ORDER BY RowN) = Student
AND LAG(GPA_Category,1) OVER (ORDER BY RowN) LIKE 'GPA < 2.5'
AND GPA_Category LIKE 'GPA < 2.5' THEN 2
WHEN GPA_Category LIKE 'GPA < 2.5' THEN 1
WHEN GPA_Category LIKE 'GPA OK' THEN 0 END AS Flag
FROM @t1 t