I have T1 which is formatted as
STUDENT SCORE DATE
1 6 2022-02-01
1 0 2022-03-12
1 5 2022-04-30
1 1 2022-04-30
1 1 2022-05-14
1 1 2022-05-19
1 8 2022-05-26
2 9 2022-01-02
2 10 2022-04-11
2 2 2022-04-12
2 0 2022-04-17
2 7 2022-05-08
2 4 2022-05-12
3 10 2022-01-09
3 2 2022-02-11
3 6 2022-03-16
3 3 2022-03-18
3 2 2022-04-02
3 9 2022-04-27
4 4 2022-02-24
4 0 2022-02-26
4 9 2022-02-28
4 2 2022-03-27
4 8 2022-04-02
4 4 2022-04-14
5 3 2022-01-28
5 5 2022-02-12
5 6 2022-02-18
5 0 2022-02-21
5 4 2022-04-05
XX 0.711094564 2022-02-28
XX 0.60584994 2022-03-31
XX 0.087965016 2022-04-30
YY 0.497937992 2022-02-28
YY 0.727796963 2022-03-31
YY 0.974471085 2022-04-30
YY 0.780187398 2022-05-30
First I wish to extract the UNIQUE values of DATE when STUDENT equals to XX and YY. There are some duplicates. These are the trimester ending dates.
Then I wish to SUM up the SCORE values for each STUDENT across the date ranges from Step 1:
And together I wish to generate this table T2
STUDENT SCORE DATE.START DATE.END
1 6 2022-01-01 2022-02-28
1 0 2022-03-01 2022-03-31
1 6 2022-04-01 2022-04-30
1 10 2022-05-01 2022-05-30
2 9 2022-01-01 2022-02-28
2 NA 2022-03-01 2022-03-31
2 12 2022-04-01 2022-04-30
2 11 2022-05-01 2022-05-30
3 12 2022-01-01 2022-02-28
3 9 2022-03-01 2022-03-31
3 11 2022-04-01 2022-04-30
3 NA 2022-05-01 2022-05-30
4 13 2022-01-01 2022-02-28
4 2 2022-03-01 2022-03-31
4 12 2022-04-01 2022-04-30
4 NA 2022-05-01 2022-05-30
5 14 2022-01-01 2022-02-28
5 NA 2022-03-01 2022-03-31
5 4 2022-04-01 2022-04-30
5 NA 2022-05-01 2022-05-30
And I try this,
WITH T2 AS(
SELECT DISTINCT(DATE),
COALESCE(LEAD(DATE) OVER(ORDER BY DATE) AS DATE.START,
ROW_NUMBER() OVER (ORDER BY DATE) AS DATE.END
FROM T1 WHERE (STUDENT = 'XX' OR STUDENT = 'YY')
)
SELECT STUDENT, SUM(SCORE), DATE.START, DATE.END
FROM T1 JOIN T2 ON T1.DATE.START BETWEEN DATE.START AND DATE.END
CodePudding user response:
Your date ranges are just month boundaries, so you can use ordinary GROUP BY
SELECT student, DATE_FORMAT(date, '%Y-%m') AS year_month, SUM(score) AS score
FROM T1
WHERE student IN ('XX', 'YY')
GROUP BY student, year_month
CodePudding user response:
Here we create a calendar divided to trimesters and join it with our table matching each date to the correct trimester in the calendar.
DECLARE @Year INT = '2019';
DECLARE @YearCnt INT = 4 ;
DECLARE @StartDate DATE = DATEFROMPARTS(@Year, 01, 01)
DECLARE @EndDate DATE = DATEADD(DAY, -1, DATEADD(YEAR, @YearCnt, @StartDate));
;WITH Cal(n) AS
(
SELECT 0 UNION ALL SELECT n 4 FROM Cal
WHERE n < DATEDIFF(month, @StartDate, @EndDate)
),
FnlDt(d) AS
(
SELECT DATEADD(month, n, @StartDate) FROM Cal
),
FinalCte AS
(
SELECT
[Date] = CONVERT(DATE,d)
FROM FnlDt
)
select STUDENT
,sum(score) as total_score
,semester_start
,semester_end
from (
select date as semester_start
,lead(dateadd(day, -1, date)) over(order by date) as semester_end
from finalCte
) t2 join t on t.date between semester_start and semester_end
group by STUDENT, semester_start, semester_end
STUDENT | total_score | semester_start | semester_end |
---|---|---|---|
1 | 12 | 2022-01-01 | 2022-04-30 |
1 | 10 | 2022-05-01 | 2022-08-31 |
2 | 21 | 2022-01-01 | 2022-04-30 |
2 | 11 | 2022-05-01 | 2022-08-31 |
3 | 32 | 2022-01-01 | 2022-04-30 |
4 | 27 | 2022-01-01 | 2022-04-30 |
5 | 18 | 2022-01-01 | 2022-04-30 |
CodePudding user response:
In this one we're looking at score by month so we don't really need the bottom half of the information. In fact, it makes it better not to have it cause then we won't be able to save student
and score
as int.
select student
,sum(score) as total_score
,max(dateadd(day, 1, eomonth(date, -1))) as "DATE.START"
,max(eomonth(date)) as "DATE.END"
from t
group by student, year(date), month(date)
student | total_score | DATE.START | DATE.END |
---|---|---|---|
1 | 6 | 2022-02-01 | 2022-02-28 |
1 | 0 | 2022-03-01 | 2022-03-31 |
1 | 6 | 2022-04-01 | 2022-04-30 |
1 | 10 | 2022-05-01 | 2022-05-31 |
2 | 9 | 2022-01-01 | 2022-01-31 |
2 | 12 | 2022-04-01 | 2022-04-30 |
2 | 11 | 2022-05-01 | 2022-05-31 |
3 | 10 | 2022-01-01 | 2022-01-31 |
3 | 2 | 2022-02-01 | 2022-02-28 |
3 | 9 | 2022-03-01 | 2022-03-31 |
3 | 11 | 2022-04-01 | 2022-04-30 |
4 | 13 | 2022-02-01 | 2022-02-28 |
4 | 2 | 2022-03-01 | 2022-03-31 |
4 | 12 | 2022-04-01 | 2022-04-30 |
5 | 3 | 2022-01-01 | 2022-01-31 |
5 | 11 | 2022-02-01 | 2022-02-28 |
5 | 4 | 2022-04-01 | 2022-04-30 |
CodePudding user response:
Unfortunately I do not have at this moment access to a MySQL, but only to a MSSQL server. So, I´ll present a solution for that platform. As I mentioned in the comments, the question is ill-posed, because it is not clear what is the definition of a "trimester", nor are all the datatypes very well defined (i.e., NA should be null or NA?). Nonetheless, I solved the query to present a solution identical to what is presented in the example. And it can be easily modified to accommodate any variations on the theme. Here it goes... (1st part and 2nd part):
Select Distinct Date From T1 Where Student = 'XX' or Student = 'YY' Order By date
Select t2.Student as Student,
Coalesce(Cast(Sum(Case When (T1.date is null) Then null
When (T1.date >= t2.Di and t1.date <= t2.Df) Then t1.score
Else 0
End) as varchar), 'NA') as Score,
t2.Di as DateStart,
t2.Df as DateEnd
From (Select Student, Di, Df
From (Select Distinct Student From t1 Where Student <> 'XX' and Student <> 'YY') as ta,
(Select Distinct Di=Cast(Case When DatePart(Month, date) = 2 Then DateAdd(MONTH,DateDiff(Month,0,date),-31)
Else DateAdd(Month,DateDiff(Month,0,date),0)
End as Date),
Df=Date
from T1 where student = 'XX' or student = 'YY') as tb
) as t2
Left Join T1
On t2.student = T1.student and T1.date >= t2.di and T1.date <= t2.df
Group By t2.student, t2.di, t2.df
Order By t2.student Asc, t2.di Asc
Well, anyone who runs the above code over a T1 table as shown in the question will receive the exact solution as presented in the question. Enjoy it.