I have a PostgreSQL dataset with employee name, entry id and entry times. These are repeated across several rows depending on how many times an employee clocked in. What I'm trying to do is group consecutive entries for each employee and count the number of entries.
Input
entry_id emp_name entry_time
100 John Doe 18/10/2021
101 Mark Foo 18/10/2021
102 Angie Genie 19/10/2021
103 Angie Genie 19/10/2021
104 Angie Genie 19/10/2021
105 John Doe 20/10/2021
106 John Doe 20/10/2021
107 Angie Genie 21/10/2021
108 Angie Genie 21/10/2021
109 Mark Foo 22/10/2021
110 Mark Foo 22/10/2021
111 Mark Foo 23/10/2021
112 Mark Foo 24/10/2021
Desired Output
emp_name last entry_time no of entries
John Doe 18/10/2021 1
Mark Foo 18/10/2021 1
Angie Genie 19/10/2021 3
John Doe 20/10/2021 2
Angie Genie 21/10/2021 2
Mark Foo 24/10/2021 4
I have tried using gaps and island approach to assign consecutive emp_names into groups to enable me obtain the entry counts but I have not been able to do so. I was hoping to assign groups like below but nothing is working so far.
entry_id emp_name entry_time group no of entries
100 John Doe 18/10/2021 1 1
101 Mark Foo 18/10/2021 2 1
102 Angie Genie 19/10/2021 3 3
103 Angie Genie 19/10/2021 3 3
104 Angie Genie 19/10/2021 3 3
105 John Doe 20/10/2021 4 2
106 John Doe 20/10/2021 4 2
107 Angie Genie 21/10/2021 5 2
108 Angie Genie 21/10/2021 5 2
109 Mark Foo 22/10/2021 6 4
110 Mark Foo 22/10/2021 6 4
111 Mark Foo 23/10/2021 6 4
112 Mark Foo 24/10/2021 6 4
Is this possible in PostgreSQl? Any help is appreciated. Thank you.
CodePudding user response:
You can use the window function LAG()
to check for the previous emp_name
of each row and the window function SUM()
to create the groups of consecutive repeating emp_name
s.
Finally aggregate by each group:
SELECT emp_name,
MAX(entry_time) last_entry_time,
COUNT(*) no_of_entries
FROM (
SELECT *, SUM(flag) OVER (ORDER BY entry_time, entry_id) grp
FROM (
SELECT *, (emp_name <> LAG(emp_name, 1, '') OVER (ORDER BY entry_time, entry_id))::int flag
FROM tablename
) t
) t
GROUP BY grp, emp_name
ORDER BY grp;
See the demo.