Home > Back-end >  How do I find entries that contains ALL values of a particular attribute from another table using SQ
How do I find entries that contains ALL values of a particular attribute from another table using SQ

Time:10-28

Sorry if the question was not phrased awkwardly, I didn't know how to put it without an example.

Given I have a 2 tables, Watch and WorkIn:

Watch: {emailaddress, videoID)
WorkIn: {videoID, castname}

How do I find for example, the email address of people who have watched ALL of 'David' products?

I was thinking of using GROUP BY and ALL but I'm rather new to sql querying and don't really know how to put it all together.

enter image description here

CodePudding user response:

Two tables - one with e-mails and videos watched by e-mail owner and the other with videos and cast names worked in them. As I understand the e-mail owners could watch and cast names could participate in multiple videos with cross over possibilities. Something like in this two tables (as sample data):

WITH
    t_watch AS
        (
          Select '[email protected]' "E_MAIL", 101  "VIDEO_ID" From Dual Union All
          Select '[email protected]' "E_MAIL", 101  "VIDEO_ID" From Dual Union All
          Select '[email protected]' "E_MAIL", 102  "VIDEO_ID" From Dual Union All
          Select '[email protected]' "E_MAIL", 102  "VIDEO_ID" From Dual Union All
          Select '[email protected]' "E_MAIL", 201  "VIDEO_ID" From Dual Union All
          Select '[email protected]' "E_MAIL", 101  "VIDEO_ID" From Dual Union All
          Select '[email protected]' "E_MAIL", 201  "VIDEO_ID" From Dual Union All
          Select '[email protected]' "E_MAIL", 301  "VIDEO_ID" From Dual 
        ),
    t_work_in AS
        (
          Select 101 "VIDEO_ID", 'Chriss' "CAST_NAME" From Dual Union All
          Select 101 "VIDEO_ID", 'David'  "CAST_NAME" From Dual Union All
          Select 101 "VIDEO_ID", 'Annie'  "CAST_NAME" From Dual Union All
          Select 102 "VIDEO_ID", 'Chriss' "CAST_NAME" From Dual Union All
          Select 201 "VIDEO_ID", 'Chriss' "CAST_NAME" From Dual Union All
          Select 201 "VIDEO_ID", 'David'  "CAST_NAME" From Dual Union All
          Select 201 "VIDEO_ID", 'Annie'  "CAST_NAME" From Dual Union All
          Select 202 "VIDEO_ID", 'Annie'  "CAST_NAME" From Dual Union All
          Select 301 "VIDEO_ID", 'Robert' "CAST_NAME" From Dual
        ),

The question is about getting the e-mail address of a person watched all the videos participated by specific cast name. To get the answer you should get the lists of:

  • unique ordered video ids watched by particular e-mail owner
  • unique ordered video ids that particular cast name worked in
    To do that create two ctes - cte_watched_by and cte_worked_in:
    cte_worked_in AS
        (
            SELECT DISTINCT
                wk.CAST_NAME, 
                LISTAGG(wk.VIDEO_ID, ', ') WITHIN GROUP (Order By wk.VIDEO_ID) OVER(Partition By wk.CAST_NAME) "WORKED_LIST",
                Count(DISTINCT wk.VIDEO_ID) OVER(Partition By wk.CAST_NAME) "COUNT_IDS_WORKED"
            FROM
                (Select DISTINCT wrk.CAST_NAME, wrk.VIDEO_ID From t_work_in wrk Left Join t_watch wtc ON(wtc.VIDEO_ID = wrk.VIDEO_ID) Order By wrk.VIDEO_ID) wk
        ), 
    cte_watched_by AS
        (
            SELECT DISTINCT
                wb.E_MAIL, 
                LISTAGG(wb.VIDEO_ID, ', ') WITHIN GROUP (Order By wb.VIDEO_ID) OVER(Partition By wb.E_MAIL) "WATCHED_LIST",
                Count(DISTINCT wb.VIDEO_ID) OVER(Partition By wb.E_MAIL) "COUNT_IDS_WATCHED"
            FROM
                (Select DISTINCT wtc.E_MAIL, wtc.VIDEO_ID From t_watch wtc Left Join t_work_in wrk ON(wrk.VIDEO_ID = wtc.VIDEO_ID) Order By wtc.VIDEO_ID) wb
        )  

This is what we have got so far:
cte_worked_in

CAST_NAME WORKED_LIST COUNT_IDS_WORKED
Chriss 101, 102, 201 3
Robert 301 1
Annie 101, 201, 202 3
David w101, 201 2

cte_watched_by

E_MAIL WATCHED_LIST COUNT_IDS_WATCHED
[email protected] 101, 201 2
[email protected] 101, 201 2
[email protected] 102 1
[email protected] 101, 102, 301 3

Now we can join those ctes and get the answer - below is main SQL:

SELECT DISTINCT
    wk.CAST_NAME "WORKED_IN",
    wk.COUNT_IDS_WORKED "COUNT_IDS_WORKED",
    wk.WORKED_LIST,
    wb.E_MAIL "WATCHED_BY",
    wb.COUNT_IDS_WATCHED "COUNT_IDS_WATCHED",
    wb.WATCHED_LIST
FROM
    cte_worked_in wk
INNER JOIN    
    cte_watched_by wb ON
            (
                (wk.COUNT_IDS_WORKED = wb.COUNT_IDS_WATCHED And wk.WORKED_LIST = wb.WATCHED_LIST)
              OR
                (wk.COUNT_IDS_WORKED = 1 And wb.COUNT_IDS_WATCHED > 1 And InStr(wb.WATCHED_LIST, wk.WORKED_LIST) > 0)
              OR
                (wk.COUNT_IDS_WORKED = 2 And wb.COUNT_IDS_WATCHED > 2 And InStr(wb.WATCHED_LIST, SubStr(wk.WORKED_LIST, 1, 3)) > 0 And InStr(wb.WATCHED_LIST, SubStr(wk.WORKED_LIST, 6, 3)) > 0)
            )

... and here is resulting dataset:

WORKED_IN COUNT_IDS_WORKED WORKED_LIST WATCHED_BY COUNT_IDS_WATCHED WATCHED_LIST
Robert 1 301 [email protected] 3 101, 102, 301
David 2 101, 201 [email protected] 2 101, 201
David 2 101, 201 [email protected] 2 101, 201

The ORs within ON expression of INNER JOIN serve to cover situations where somebody watch a lot of movies with cast names that works in just few movies. It is the weakest part of this answer.
The resulting dataset could be filtered (if needed) to show just results of a particular cast name and/or e-mail.
Regards...

CodePudding user response:

Wasn't able to test this but I think this should work:

SELECT emailaddress, count(*)
FROM Watch w inner join WorkIn wi on w.videoID = wi.videoID
WHERE wi.castName = 'Chris Evans'
GROUP BY emailaddress
HAVING count(*) = (select count(*) from WorkIn where wi.castName = 'Chris Evans')

The group by will count for each emailaddress the number of videos seen with Chris Evans.

HAVING is like a WHERE clause on the aggregated result (so after the GROUP BY). Only when this count is equal to the total number of videos in which Chris Evans plays, the emailaddress is returned.

  • Related