Back Story
I have an employee table where each employee has one or more work role(s) assigned to them, and for each assigned role it shows the year that it was assigned. I have a query that builds a matrix table, with one row per employee, like so:
name | role1 | role2 | role3 | role4 | role5 |
---|---|---|---|---|---|
Bunny, Bugs | 2022 | ||||
Coyote, Wiley | 2018 | 2018 | |||
Pig, Porky | 2017 | ||||
Mouse, Mickey | 2020 | ||||
Panther, Pink | 2019 | ||||
Cheese, Chuckey | 2021 | 2017 | |||
Duck, Donald | 2021 | 2021 | |||
Devil, Taz | 2019 | ||||
Brown, Charlie | 2021 | 2019 | |||
Flintstone, Fred | 2019 | 2011 | 2016 |
Summary Columns
I need to have some additional columns that compute the following for each employee:
- Number of roles assigned to the employee (COUNT?)
- The year of the oldest role assigned (LEAST)
- The year of the most recent role assigned (GREATEST)
Desired Results
This table illustrates the desired results of my analysis:
name | role1 | role2 | role3 | role4 | role5 | role_count | role_oldest | role_newest |
---|---|---|---|---|---|---|---|---|
Bunny, Bugs | 2022 | 1 | 2022 | 2022 | ||||
Coyote, Wiley | 2018 | 2015 | 2 | 2015 | 2018 | |||
Pig, Porky | 2017 | 1 | 2017 | 2017 | ||||
Mouse, Mickey | 2020 | 1 | 2020 | 2020 | ||||
Panther, Pink | 2019 | 1 | 2019 | 2019 | ||||
Cheese, Chuckey | 2021 | 2017 | 2 | 2017 | 2021 | |||
Duck, Donald | 2021 | 2021 | 2 | 2021 | 2021 | |||
Devil, Taz | 2019 | 1 | 2019 | 2019 | ||||
Brown, Charlie | 2021 | 2019 | 2 | 2019 | 2021 | |||
Flintstone, Fred | 2019 | 2011 | 2016 | 3 | 2016 | 2019 |
My Mixed Results
For the role_oldest
and role_newest
columns, I was able to use the following statements:
UPDATE employee_roles
SET role_oldest = LEAST(role1, role2, role3, role4, role5);
UPDATE employee_roles
SET role_newest = GREATEST(role1, role2, role3, role4, role5);
However, in iterative attempts at computing the role_count, I unsuccessfully tried various combinations of the COUNT
, ARRAY
, UNNEST
and STRING_TO_ARRAY
functions, like so:
UPDATE employee_roles
SET role_count = COUNT(role1, role2, role3, role4, role5);
UPDATE employee_roles
SET role_count = COUNT(UNNEST(role1, role2, role3, role4, role5));
UPDATE employee_roles
SET role_count = COUNT(UNNEST(ARRAY(role1, role2, role3, role4, role5)));
UPDATE employee_roles
SET role_count = COUNT(UNNEST(ARRAY(STRING_TO_ARRAY(role1, role2, role3, role4, role5))));
Climbing Out of the Rabbit Hole
It is apparent that I am going the wrong way in this exercise, so I am pulling back out and seeking your help in this. I am sure there is indeed an elegant solution for this, and it is right under my nose, and I hope someone can help me find it.
Similar Questions with Complicated Answers
I have found numerous other similar posts here on SO, and every one of them include crazy solutions that require lots of PSQL gymnastics, and I find it hard to believe that there is not a simple function that does this.
Code for Creating Sample Table
The following code blocks will help quickly create the sample table for this exercise:
create table employee_roles
(
name text,
role1 integer default null,
role2 integer default null,
role3 integer default null,
role4 integer default null,
role5 integer default null,
role6 integer default null,
role_count integer default null,
role_oldest integer default null,
role_newest integer default null
);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Bunny, Bugs',null,null,null,null,2022);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Coyote, Wiley',2018,null,null,2018,null);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Pig, Porky',null,null,2017,null,null);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Mouse, Mickey',null,null,null,2020,null);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Panther, Pink',2019,null,null,null,null);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Cheese, Chuckey',null,null,2021,2017,null);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Duck, Donald',2021,null,null,2021,null);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Devil, Taz',null,2019,null,null,null);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Brown, Charlie',null,null,2021,null,2019);
insert into employee_roles (name,role1,role2,role3,role4,role5)
values ('Flintstone, Fred',2019,null,2011,2016,null);
CodePudding user response:
I'm not sure which version you're using (can you update?) -- in v. 14, it doesn't look like GREATEST()
and LEAST()
behave in the way you described.
You can try the following to get the COUNT
output you're looking for:
edb=# select name, array_length(array_remove(array[role1,role2,role3,role4,role5],null),1) from employee_roles ;
name | array_length
------------------ --------------
Bunny, Bugs | 1
Coyote, Wiley | 2
Pig, Porky | 1
Mouse, Mickey | 1
Panther, Pink | 1
Cheese, Chuckey | 2
Duck, Donald | 2
Devil, Taz | 1
Brown, Charlie | 2
Flintstone, Fred | 3
(10 rows)