I'm working with PostgreSQL 13. I have two tables like this:
permission_table
name | permission |
---|---|
Ann | Read Invoice |
Ann | Write Invoice |
Ann | Execute Payments |
Bob | Read Staff data |
Bob | Modify Staff data |
Bob | Execute Payroll |
Carl | Read Invoice |
Carl | Write Invoice |
risk_table
risk_id | permission |
---|---|
Risk1 | Read Invoice |
Risk1 | Write Invoice |
Risk1 | Execute Payments |
Risk2 | Read Staff data |
Risk2 | Modify Staff data |
Risk2 | Execute Payroll |
I'd like to create a new table containing the names of the employees of the first table whose permissions are pointed as risks in the second table. After the execution, the results should be like this:
name | risk_id |
---|---|
Ann | Risk1 |
Bob | Risk2 |
Since Carl only has two of the three permissions belonging to Risk2, he will not be included in the results.
My first brute force approach was to compare the list of permissions belonging to a risk to the permissions belonging to an employee. If the first list is included in the second one, then that combination of employee/risk will be added to the results table.
INSERT INTO results_table
SELECT a.employee, b.risk_id FROM permission_table a, risk_table b WHERE
((SELECT permission FROM risk_table c WHERE b.permission = c.permission ) EXCEPT
(SELECT permission FROM permission_table d WHERE a.employee=d.employee)
) IS NULL;
I'm not sure if the results could be correct using this approach, because if the tables are big, it takes a very long time even if I add a WHERE clause limiting the query to just one employee.
Could you please help?
CodePudding user response:
One way of approaching this one is by
- computing the amount of permissions for each "risk_id" value
- joining the "permissions" and "risks" table with counts on matching "permission" values
- making sure that the distinct count of permissions for each triplet "<permissions.name, risks.risk_id, risks.cnt>" corresponds to the full amount of permissions.
WITH risks_with_counts AS (
SELECT *, COUNT(permission) OVER(PARTITION BY risk_id) AS cnt
FROM risks
)
SELECT p.name, r.risk_id
FROM permissions p
INNER JOIN risks_with_counts r
ON p.permission = r.permission
GROUP BY p.name, r.risk_id, r.cnt
HAVING COUNT(DISTINCT r.permission) = r.cnt
Carl won't be included in the output as he doesn't have all permissions from "risk_id = 'Risk 1'"
Check the demo here.