I have 3 tables. User Accounts, IncomingSentences and AnnotatedSentences. Annotators annotate the incoming sentences and tag an intent to it. Then, admin reviews those taggings and makes the corrections on the tagged intent.
DB-Fiddle Playground link: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=00a770173fa0568cce2c482643de1d79
Assuming myself as the admin, I want to pull the error report per annotator.
My tables are as follows:
User Accounts table:
userId | userEmail | userRole |
---|---|---|
1 | [email protected] | editor |
2 | [email protected] | editor |
3 | [email protected] | editor |
4 | [email protected] | admin |
5 | [email protected] | admin |
Incoming Sentences Table
sentenceId | sentence | createdAt |
---|---|---|
1 | sentence1 | 2021-01-01 |
2 | sentence2 | 2021-01-01 |
3 | sentence3 | 2021-01-02 |
4 | sentence4 | 2021-01-02 |
5 | sentence5 | 2021-01-03 |
6 | sentence6 | 2021-01-03 |
7 | sentence7 | 2021-02-01 |
8 | sentence8 | 2021-02-01 |
9 | sentence9 | 2021-02-02 |
10 | sentence10 | 2021-02-02 |
11 | sentence11 | 2021-02-03 |
12 | sentence12 | 2021-02-03 |
Annotated Sentences Table
id | annotatorId | sentenceId | annotatedIntent |
---|---|---|---|
1 | 1 | 1 | intent1 |
2 | 4 | 1 | intent2 |
3 | 2 | 2 | intent4 |
4 | 3 | 4 | intent4 |
5 | 1 | 5 | intent2 |
6 | 3 | 3 | intent3 |
7 | 5 | 3 | intent2 |
8 | 1 | 6 | intent4 |
9 | 4 | 6 | intent1 |
10 | 1 | 7 | intent1 |
11 | 4 | 7 | intent3 |
12 | 3 | 9 | intent3 |
13 | 2 | 10 | intent3 |
14 | 5 | 10 | intent1 |
Expected Output:
I want an output as a table which provides the info about total-sentences-annotated-per-each editor and the total-sentences-corrected-by-admin on top of editor annotated sentences. I don't want to view the admin-tagged-count in the same table. If it comes also, total-admin-corrected should return 0.
|userEmail |totalTagged|totalAdminCorrected|
|---------------|------------|---------------------|
|[email protected]| 4 | 3 |
|[email protected]| 2 | 1 |
|[email protected]| 3 | 1 |
Query I wrote: I've tried my best. You can see that in the DB-Fiddle
My query is not resulting in the expected output. Requesting your help to achieve this.
CodePudding user response:
Because sentence_id
might be reviewed by different users (role), you can try to use subquery (INNER JOIN
between user_accounts
& annotated_sentences
) with window function condition aggregate function, getting count by your logic.
if you don't want to see admin
count information you can use where
filter rows.
SELECT user_email,
count(Total_Tagged) Total_Tagged,
SUM(totalAdmin) totalAdmin
FROM (
SELECT ist.sentence_id,
user_email,
user_role,
count(CASE WHEN a.user_role = 'editor' THEN 1 END) over(partition by ist.sentence_id) count(CASE WHEN a.user_role = 'admin' THEN 1 END) over(partition by ist.sentence_id) Total_Tagged,
count(CASE WHEN a.user_role = 'admin' THEN 1 END) over(partition by ist.sentence_id) totalAdmin
FROM user_accounts a
INNER JOIN annotated_sentences ats ON
a.user_id = ats.annotator_id
INNER JOIN incoming_sentences ist
ON ist.sentence_id = ats.sentence_id
) t1
WHERE user_role = 'editor'
GROUP BY user_email
ORDER BY user_email
CodePudding user response:
Okay, i really rushed this so there might still be an error in the Code, but try something like this:
SELECT
a.user_email,
count(ist) Total_Tagged,
sum(innerTable.edits)
FROM
incoming_sentences ist
JOIN annotated_sentences ats ON
ist.sentence_id = ats.sentence_id
JOIN user_accounts a ON
a.user_id = ats.annotator_id
LEFT JOIN ( SELECT ics.sentence_id, count(anno.id) AS edits FROM annotated_sentences anno
LEFT JOIN user_accounts ua ON
ua.user_id = anno.annotator_id
LEFT JOIN incoming_sentences AS ics ON
ics.sentence_id = anno.sentence_id
WHERE user_role LIKE 'admin'
GROUP BY ics.sentence_id ) AS innerTable
ON innerTable.sentence_id = ist.sentence_id
GROUP BY a.user_email
The inner select should count how many admin-edits there are per post, the outer one then sums up that number for every post a user edited.
CodePudding user response:
My proposal... It worked in my case.
SELECT UserEmail, SUM(EDICount), SUM(ADMCount)
FROM (SELECT UserAccounts.UserEmail, AnnotatedSentences.SentenceID, COUNT(*) AS EDICount
FROM AnnotatedSentences
LEFT JOIN UserAccounts ON UserAccounts.UserID=AnnotatedSentences.AnnotatorID
WHERE UserRole='editor'
GROUP BY UserAccounts.UserEmail, AnnotatedSentences.SentenceID) AS EDI
LEFT JOIN (SELECT AnnotatedSentences.SentenceID, COUNT(*) AS ADMCount
FROM AnnotatedSentences
LEFT JOIN UserAccounts ON UserAccounts.UserID=AnnotatedSentences.AnnotatorID
WHERE UserRole='admin'
GROUP BY AnnotatedSentences.SentenceID) AS ADM ON EDI.SentenceID=ADM.SentenceID
GROUP BY UserEmail
CodePudding user response:
If it is guaranteed that one sentence can only be annotated once and only be reviewed once, then you can simply group by sentence and get the editor and admin. Then you group by editor and count.
select
editor,
count(*) as total_tagged,
count(admin) as total_admin_corrected
from
(
select
max(ua.user_email) filter (where ua.user_role = 'editor') as editor,
max(ua.user_email) filter (where ua.user_role = 'admin') as admin
from annotated_sentences ans
join user_accounts ua on ua.user_id = ans.annotator_id
group by ans.sentence_id
) with_editor_and_admin
group by editor
order by editor;
Demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e409ec49af25ac8329a99b02161832fb