Home > database >  Join three tables and retrieve the expected result
Join three tables and retrieve the expected result

Time:02-16

I have 3 tables. User Accounts, IncomingSentences and AnnotatedSentences. Annotators annotate the incoming sentences and tag an intent to it. Then, admin reviews those taggings and makes the corrections on the tagged intent.

DB-Fiddle Playground link: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=00a770173fa0568cce2c482643de1d79

Assuming myself as the admin, I want to pull the error report per annotator.

My tables are as follows:

User Accounts table:

userId userEmail userRole
1 [email protected] editor
2 [email protected] editor
3 [email protected] editor
4 [email protected] admin
5 [email protected] admin

Incoming Sentences Table

sentenceId sentence createdAt
1 sentence1 2021-01-01
2 sentence2 2021-01-01
3 sentence3 2021-01-02
4 sentence4 2021-01-02
5 sentence5 2021-01-03
6 sentence6 2021-01-03
7 sentence7 2021-02-01
8 sentence8 2021-02-01
9 sentence9 2021-02-02
10 sentence10 2021-02-02
11 sentence11 2021-02-03
12 sentence12 2021-02-03

Annotated Sentences Table

id annotatorId sentenceId annotatedIntent
1 1 1 intent1
2 4 1 intent2
3 2 2 intent4
4 3 4 intent4
5 1 5 intent2
6 3 3 intent3
7 5 3 intent2
8 1 6 intent4
9 4 6 intent1
10 1 7 intent1
11 4 7 intent3
12 3 9 intent3
13 2 10 intent3
14 5 10 intent1

Expected Output:

I want an output as a table which provides the info about total-sentences-annotated-per-each editor and the total-sentences-corrected-by-admin on top of editor annotated sentences. I don't want to view the admin-tagged-count in the same table. If it comes also, total-admin-corrected should return 0.

|userEmail     |totalTagged|totalAdminCorrected|
|---------------|------------|---------------------|
|[email protected]|      4     |          3          |
|[email protected]|      2     |          1          |
|[email protected]|      3     |          1          |

Query I wrote: I've tried my best. You can see that in the DB-Fiddle

My query is not resulting in the expected output. Requesting your help to achieve this.

CodePudding user response:

Because sentence_id might be reviewed by different users (role), you can try to use subquery (INNER JOIN between user_accounts & annotated_sentences) with window function condition aggregate function, getting count by your logic.

if you don't want to see admin count information you can use where filter rows.

SELECT user_email,
       count(Total_Tagged) Total_Tagged,
       SUM(totalAdmin) totalAdmin
FROM (
    SELECT ist.sentence_id,
          user_email,
          user_role,
          count(CASE WHEN a.user_role = 'editor' THEN 1 END) over(partition by ist.sentence_id)   count(CASE WHEN a.user_role = 'admin' THEN 1 END) over(partition by ist.sentence_id) Total_Tagged,
          count(CASE WHEN a.user_role = 'admin' THEN 1 END)  over(partition by ist.sentence_id) totalAdmin
    FROM user_accounts a
    INNER JOIN annotated_sentences ats ON 
    a.user_id = ats.annotator_id
    INNER JOIN incoming_sentences ist
    ON ist.sentence_id = ats.sentence_id
) t1
WHERE user_role = 'editor'
GROUP BY user_email
ORDER BY user_email

sqlfiddle

CodePudding user response:

Okay, i really rushed this so there might still be an error in the Code, but try something like this:

SELECT
    a.user_email,
    count(ist) Total_Tagged,
    sum(innerTable.edits)
FROM
    incoming_sentences ist
JOIN annotated_sentences ats ON
    ist.sentence_id = ats.sentence_id
JOIN user_accounts a ON
    a.user_id = ats.annotator_id
    
LEFT JOIN ( SELECT ics.sentence_id, count(anno.id) AS edits FROM annotated_sentences anno
    LEFT JOIN user_accounts ua ON
    ua.user_id = anno.annotator_id
    LEFT JOIN incoming_sentences AS ics ON
    ics.sentence_id = anno.sentence_id
    WHERE user_role LIKE 'admin'
    GROUP BY ics.sentence_id ) AS innerTable
ON innerTable.sentence_id = ist.sentence_id
    
GROUP BY a.user_email

The inner select should count how many admin-edits there are per post, the outer one then sums up that number for every post a user edited.

CodePudding user response:

My proposal... It worked in my case.

SELECT UserEmail, SUM(EDICount), SUM(ADMCount)
FROM (SELECT UserAccounts.UserEmail, AnnotatedSentences.SentenceID, COUNT(*) AS EDICount
        FROM AnnotatedSentences
        LEFT JOIN UserAccounts ON UserAccounts.UserID=AnnotatedSentences.AnnotatorID
        WHERE UserRole='editor'
        GROUP BY UserAccounts.UserEmail, AnnotatedSentences.SentenceID) AS EDI
LEFT JOIN (SELECT AnnotatedSentences.SentenceID, COUNT(*) AS ADMCount
        FROM AnnotatedSentences
        LEFT JOIN UserAccounts ON UserAccounts.UserID=AnnotatedSentences.AnnotatorID
        WHERE UserRole='admin'
        GROUP BY AnnotatedSentences.SentenceID) AS ADM ON EDI.SentenceID=ADM.SentenceID
GROUP BY UserEmail

CodePudding user response:

If it is guaranteed that one sentence can only be annotated once and only be reviewed once, then you can simply group by sentence and get the editor and admin. Then you group by editor and count.

select
  editor,
  count(*) as total_tagged,
  count(admin) as total_admin_corrected
from
(
  select
    max(ua.user_email) filter (where ua.user_role = 'editor') as editor,
    max(ua.user_email) filter (where ua.user_role = 'admin') as admin
  from annotated_sentences ans
  join user_accounts ua on ua.user_id = ans.annotator_id
  group by ans.sentence_id
) with_editor_and_admin
group by editor
order by editor;

Demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e409ec49af25ac8329a99b02161832fb

  • Related