Home > Blockchain >  Adding a SUM group by into a COUNT DISTINCT query
Adding a SUM group by into a COUNT DISTINCT query

Time:11-14

I have the following SQL query that count the distinct quizzes where a tag is below a certain threshold.

SELECT
  COUNT(DISTINCT "quizzes"."id")
FROM
  "quizzes"
  INNER JOIN "sessions" ON "sessions"."id" = "quizzes"."session_id"
  INNER JOIN "subscriptions" ON "subscriptions"."id" = "sessions"."subscription_id"
  LEFT OUTER JOIN "quiz_answers" ON "quiz_answers"."quiz_id" = "quizzes"."id"
  LEFT OUTER JOIN "answers" ON "answers"."id" = "quiz_answers"."answer_id"
  LEFT OUTER JOIN "tag_scores" ON "tag_scores"."answer_id" = "answers"."id"
  LEFT OUTER JOIN "tags" ON "tags"."id" = "tag_scores"."tag_id"
WHERE
  (subscriptions.state = 'subscribed')
  AND (
    tags.id = 56
    and score <= 10
  );

The score is a sum of the same tag present under a same quiz.

I want to be able to add in the query something like this:

SUM(tag_scores.score) group by tags.id, quizzes.id AS score

Such that the score is calculated not on the particular tag score but on the sum of the same tags grouped by tags.id and quizzes.id.

The output should be simply the count number of the result.

I'm having a hard time how to add this to the query.

Any idea or hint how to continue?

CodePudding user response:

This should do it:

SELECT count(*)
FROM   quizzes q
JOIN   LATERAL (
   SELECT SUM(ts.score) AS sum_score
   FROM   quiz_answers qa
   JOIN   answers      a  ON a.id = qa.answer_id
   JOIN   tag_scores   ts ON ts.answer_id = a.id
-- JOIN   tags         t  ON t.id = ts.tag_id  -- not needed
   WHERE  qa.quiz_id = q.id
   AND    ts.tag_id = 56
   ) AS score56 ON score56.sum_score <= 10
WHERE  EXISTS (
   SELECT FROM sessions se
   JOIN   subscriptions su ON su.id = se.subscription_id
   WHERE  se.id = q.session_id
   AND    su.state = 'subscribed'
   );

The LATERAL join enforces your desired additional filter. Quizzes are eliminated where ...

  • no tag_id 56 is connected at all (then sum_score is NULL).
  • or the sum of all scores for tag 56 is > 10.

Since we do not multiply rows from the main table quizzes with joins, we also don't need the expensive count(DISTINCT ...). A simple count(*) does it now.

I moved your original filter to an EXISTS expression since I don't know whether those joins can multiply rows. If sessions and subscriptions are in a 1:1 relationship to quizzes, then you can keep your original joins for those.

  • Related