Home > Back-end >  How to use the NOT IN operator (<>) in a GROUP_CONCAT with dates?
How to use the NOT IN operator (<>) in a GROUP_CONCAT with dates?

Time:02-25

I have the calendar table with all the dates of the month of December 2021 (I will only exemplify some dates within the table, but it is understood that it actually contains all the days of said month):

ID date
01 2021-12-01
02 2021-12-02
03 2021-12-03
04 2021-12-04
05 2021-12-05

I have the users table:

ID name num_employee
01 Andrew 101
02 Mary 102

I have the table assistances

ID date num_employee
01 2021-12-03 101
02 2021-12-04 101
03 2021-12-03 102
04 2021-12-04 102
05 2021-12-05 101
06 2021-12-06 102

I worked on a query to display the employee number, their name, the days they attended and the days they were absent:

SELECT u.num_employee,
       u.name,
       a.date AS attendances,
       c.date as faults FROM users u
JOIN (SELECT num_employee,
      GROUP_CONCAT(DISTINCT EXTRACT(DAY FROM date)) AS date FROM attendances
      WHERE date BETWEEN '2021-12-01' AND '2021-12-31'
      GROUP BY num_employee) a ON a.not_employee = u.num_employee
LEFT JOIN (SELECT GROUP_CONCAT(DISTINCT EXTRACT(DAY FROM date)) AS date FROM calendar
           WHERE date BETWEEN '2021-12-01' AND '2021-12-31') c ON c.date <> a.date

With the above query, I get this:

num_employee name assists faults
101 Andrew 3,4,5 1,2,3,4,5,6,7,8,9,10...
102 Mary 3,4,6 1,2,3,4,5,6,7,8,9,10...

In the attendance column I obtain the days of the month of December in which each employee had attendance, and in the faults I should only obtain the days in which there were absences, but all the days of the month of December are displayed.

I am almost sure that the problem is in how I evaluate that the numbers of the days displayed in the attends column are not displayed in the absences column. Specifically in this part I consider that my evaluation is wrong:

ON c.date <> a.date

I'm under the impression that since I'm working with GROUP_CONCAT, I should evaluate dates differently. How could I adapt my query to get the following?

not_employee name attendances faults
101 Andrew 3,4,5 1,2,3,6,7,8,9,10...
102 Mary 3,4,6 1,2,5,7,8,9,10...

The query in question cannot be adapted to use CTE given the version of MariaDB I am using. I am working on phpMyAdmin.

CodePudding user response:

One solution is a subselect.

This works also in mysql 5, with mysql 8 you could make a CTE from attendense.

CREATE TABLE calendar (
  `ID` INTEGER,
  `date` VARCHAR(10)
);

INSERT INTO calendar
  (`ID`, `date`)
VALUES
  ('01', '2021-12-01'),
  ('02', '2021-12-02'),
  ('03', '2021-12-03'),
  ('04', '2021-12-04'),
  ('05', '2021-12-05'),
  ('06', '2021-12-06'),
  ('07', '2021-12-07'),
  ('08', '2021-12-08'),
  ('09', '2021-12-09'),
  ('10', '2021-12-10'),
  ('11', '2021-12-11');
CREATE TABLE users (
  `ID` INTEGER,
  `name` VARCHAR(6),
  `num_employee` INTEGER
);

INSERT INTO users
  (`ID`, `name`, `num_employee`)
VALUES
  ('01', 'Andrew', '101'),
  ('02', 'Mary', '102');
CREATE TABLE attendances (
  `ID` INTEGER,
  `date` VARCHAR(10),
  `num_employee` INTEGER
);

INSERT INTO attendances
  (`ID`, `date`, `num_employee`)
VALUES
  ('01', '2021-12-03', '101'),
  ('02', '2021-12-04', '101'),
  ('03', '2021-12-03', '102'),
  ('04', '2021-12-04', '102'),
  ('05', '2021-12-05', '101'),
  ('06', '2021-12-06', '102');
SELECT u.num_employee,
       u.name,
       a.date AS attendances,
        (SELECT GROUP_CONCAT(DISTINCT EXTRACT(DAY FROM date)) AS date FROM calendar
           WHERE date BETWEEN '2021-12-01' AND '2021-12-31'
           AND NOT FIND_IN_SET(EXTRACT(DAY FROM date),a.date)) as faults FROM users u
JOIN (SELECT num_employee,
      GROUP_CONCAT(DISTINCT EXTRACT(DAY FROM date)) AS date FROM attendances
      WHERE date BETWEEN '2021-12-01' AND '2021-12-31'
      GROUP BY num_employee) a ON a.num_employee = u.num_employee
num_employee | name   | attendances | faults           
-----------: | :----- | :---------- | :----------------
         101 | Andrew | 3,4,5       | 1,2,6,7,8,9,10,11
         102 | Mary   | 3,4,6       | 1,2,5,7,8,9,10,11

db<>fiddle here

CodePudding user response:

Consider a cross join of the users and calendar tables for all possible pairwise matches of employees and dates. Then left join to assistance and run aggregate GROUP_CONCAT with one conditional expression for faults:

WITH cte AS (
   SELECT u.num_employee,
          u.name,
          c.date
   FROM calendar c
   CROSS JOIN users u
   WHERE c.date BETWEEN '2021-12-01' AND '2021-12-31'
)

SELECT cte.num_employee,
       cte.name,
       GROUP_CONCAT(DISTINCT EXTRACT(DAY FROM a.date)) AS attendances,
       GROUP_CONCAT(DISTINCT 
           IF(a.date IS NULL, EXTRACT(DAY FROM cte.date), NULL)
       ) AS faults 
FROM cte
LEFT JOIN attendances a
   ON cte.date = a.date
   AND cte.num_employee = a.num_employee
   AND a.date BETWEEN '2021-12-01' AND '2021-12-31'
GROUP BY cte.num_employee,
         cte.name;

Online Demo

  • Related