Home > Back-end >  MySQL - Task Manager - S-Curve - Summarizing tasks
MySQL - Task Manager - S-Curve - Summarizing tasks

Time:12-28

I'm bulding a Project Manager System using PHP and MySQL, among other languages, and I'm strugling to query my data in order to create a S-Curve Chart. In my data base I have the following table named "Tasks"

ID Task Beginning_Date End_date Status
1 Task-1 2021-12-01 2021-12-06 Pendent
2 Task-2 2021-12-02 2021-12-07 Completed
3 Task-3 2021-12-03 2021-12-07 Completed

To draw the chart, I need to have a row for every day between the minimum date and the maximum date of the Task Table table, even if this exact date does not exist in the task table (example: 2021-12-04 - it is not written on the table, but it happens between 2021-12-01 and 2021-12-07). And, for every day, I need to count the duration of all the tasks in days that exist before the date and the same for only the completed tasks (I'll try to explain better with an example):

Day Days_Tasks Days_Tasks_Completed Observation:
2021-12-01 (first day from the Task Table) 1 0 Counting: Only one day of task-1
2021-12-02 3 1 Days_Task: 2 days of task1 1 day of task2 Completed_Taks: 1 day of Task2
2021-12-03 6 3 Days_Task: 3 days of task1 2d of task2 1d task3 Completed_Taks: 2 day of Task2 and 1d task3 (it does not count the days from task1 becouse it is not "Completed"
2021-12-04 9 5 Days_Task: 4days of task1 plus 3d of task2 plus 2d task3 Completed_Taks: 3d of task2 and 2d task3
2021-12-05 12 7 Days_Task: 5d of task1 plus 4d of task2 plus 3d task3 Completed_Taks: 4d of task2 plus 3d task3
2021-12-06 15 9 Days_Task: 6d of task1 plus 5d of task2 plus 4d task3 Completed_Taks: 5d of task2 plus 4d task3
2021-12-07 (last day) 17 11 Days_Task: 6d of task1 plus 6d of task2 plus 5d task3 Completed_Taks: 6d of task2 plus 5d task3

Explanation: The last day (2020-12-07) is the last date that appear on the Task Table. For that day, I'd have do count 17 "Days Task" (6 days for the Task 1 becouse all the 6 days of duration happend before 2020-12-07 all the days for Task2 all the day for the duration of task 3) The same goes for the completed "Days Task" but only counting day of completed tasks - for that reason, not considering Task 1 since it is "Pendent").

I honestly, have no idea how to perform that using MySQL. I tryed a bunch of queries and nothing seemed to work as I wanted.

I was able to get a Table with all the dates using:

@i<-1;
SELECT DATE(ADDDATE((SELECT MIN(Beginning_Date) from tasks), INTERVAL @i:=@i 1 DAY)) AS date FROM `tasks`
HAVING 
@i < DATEDIFF((SELECT MAX(End_date) from tasks), (SELECT MIN(Beginning_Date) from tasks )))

However, I have no idea how to count the tasks for these dates.

Does anyone knows how to do it?

CodePudding user response:

There appear to be some errors in your table of desired output. If I understand correctly the following query works but this requires MySQL 8. The first CTE is your example data. The second CTE is a recursive CTE to provide the date range. Then the query does some aggregation based on the dates and statuses.

WITH RECURSIVE `Tasks` (`ID`, `Task`, `Beginning_Date`, `End_date`, `Status`) AS
(
    SELECT 1, 'Task-1', DATE('2021-12-01'), DATE('2021-12-06'), 'Pendent' UNION ALL
    SELECT 2, 'Task-2', DATE('2021-12-02'), DATE('2021-12-07'), 'Completed' UNION ALL
    SELECT 3, 'Task-3', DATE('2021-12-03'), DATE('2021-12-07'), 'Completed'
),
`dates` (`date`) AS
(
  SELECT MIN(`Beginning_Date`) FROM `Tasks`
  UNION ALL
  SELECT `date`   INTERVAL 1 DAY FROM `dates`
  WHERE `date`   INTERVAL 1 DAY <= (SELECT MAX(`End_date`) FROM `Tasks`)
)
SELECT
    `d`.`date` AS `Day`,
    SUM(DATEDIFF(LEAST(`d`.`date`, `t`.`End_date`), `t`.`Beginning_Date`)   1) AS `Days_Tasks`,
    SUM(IF(`t`.`Status` = 'Completed', DATEDIFF(LEAST(`d`.`date`, `t`.`End_date`), `t`.`Beginning_Date`)   1, 0)) AS `Days_Tasks_Completed`
FROM `dates` `d`
INNER JOIN `Tasks` `t` ON `t`.`Beginning_Date` <= `d`.`date`
GROUP BY `d`.`date`
ORDER BY `d`.`date`;

Here is a db<>fiddle

To better understand what is happening we can look at the result of the JOIN for just Task-1

/* The CTEs are as above but left out for brevity */
SELECT
    `d`.`date` AS `Day`, t.*,
    /*
        we need to add one as MAX Days_Tasks is inclusive
        of Beginning_Date and End_date
    */
    DATEDIFF(`d`.`date`, `t`.`Beginning_Date`)   1 AS `Days_Tasks_1`,
    /*
        we need to use the LEAST of d.date and t.End_date so
        we do not continue to count days beyond the t.End_date
    */
    DATEDIFF(LEAST(`d`.`date`, `t`.`End_date`), `t`.`Beginning_Date`)   1 AS `Days_Tasks_2`
FROM `dates` `d`
INNER JOIN `Tasks` `t` ON `t`.`Beginning_Date` <= `d`.`date`
WHERE `t`.`ID` = 1
ORDER BY `d`.`date` ASC;

which returns -

Day ID Task Beginning_Date End_date Status Days_Tasks_1 Days_Tasks_2
2021-12-01 1 Task-1 2021-12-01 2021-12-06 Pendent 1 1
2021-12-02 1 Task-1 2021-12-01 2021-12-06 Pendent 2 2
2021-12-03 1 Task-1 2021-12-01 2021-12-06 Pendent 3 3
2021-12-04 1 Task-1 2021-12-01 2021-12-06 Pendent 4 4
2021-12-05 1 Task-1 2021-12-01 2021-12-06 Pendent 5 5
2021-12-06 1 Task-1 2021-12-01 2021-12-06 Pendent 6 6
2021-12-07 1 Task-1 2021-12-01 2021-12-06 Pendent 7 6

Days_Tasks_1 looks fine until we get to the final row where it has continued increasing even though Task-1 has ended. To correct this we need to use the LEAST of Day and End_date so we do not continue beyond the End_date of the task. See Days_Tasks_2

Now if we look at the JOIN result for just Day 2021-12-07

/* CTEs omitted as above */
SELECT
    `d`.`date` AS `Day`, t.*,
    DATEDIFF(LEAST(`d`.`date`, `t`.`End_date`), `t`.`Beginning_Date`)   1 AS `Days_Tasks`,
    IF(`t`.`Status` = 'Completed', DATEDIFF(LEAST(`d`.`date`, `t`.`End_date`), `t`.`Beginning_Date`)   1, 0) AS `Days_Tasks_Completed`
FROM `dates` `d`
INNER JOIN `Tasks` `t` ON `t`.`Beginning_Date` <= `d`.`date`
WHERE `d`.`date` = '2021-12-07'
ORDER BY `t`.`ID` ASC;

which returns -

Day ID Task Beginning_Date End_date Status Days_Tasks Days_Tasks_Completed
2021-12-07 1 Task-1 2021-12-01 2021-12-06 Pendent 6 0
2021-12-07 2 Task-2 2021-12-02 2021-12-07 Completed 6 6
2021-12-07 3 Task-3 2021-12-03 2021-12-07 Completed 5 5

Here you can see the counts as described in Observation: column of the example. With the grouping on Day and SUM of Days_Tasks and Days_Tasks_Completed you can see the result desired.

# Day Days_Tasks Days_Tasks_Completed
2021-12-01 1 0
2021-12-02 3 1
2021-12-03 6 3
2021-12-04 9 5
2021-12-05 12 7
2021-12-06 15 9
2021-12-07 17 11
  • Related