Home > Software engineering >  How to JOIN a 2nd table with condition based on 3rd JOIN table?
How to JOIN a 2nd table with condition based on 3rd JOIN table?

Time:12-13

I need to join 2 tables with the 2nd table with JOIN condition that is based on the 3rd JOIN table.

The problem is, I couldn't get the value from 3rd JOIN table while still joining the 2nd table.

View on DB Fiddle

Table A: users
-------
user_id INT PRIMARY
Table B: orders
-------
order_id INT PRIMARY
user_id INT
invoice_id INT
Table C: invoices
-------
invoice_id INT PRIMARY
invoice_status VARCHAR [voided, paid]

This is what I wish to do:

SELECT 
    A.user_id,
    B.order_id,
    C.invoice_id,
    C.invoice_status
FROM users A
LEFT JOIN orders B 
    ON (B.user_id = A.user_id
        AND C.invoice_status = 'paid')
LEFT JOIN invoices C
    ON (C.invoice_id = B.invoice_id)

The 9th line AND C.invoice_status = 'paid') is referring to table C which has not been joined yet, so this query will cause Error in query (1054): Unknown column 'C.invoice_status' in 'on clause'

The reason I cannot have this condition in WHERE clause is because I still want to return all users records regardless of whether they have any order or 'paid' invoice or not. So adding WHERE invoice_status = 'paid' will NOT return users without any order and users with invoice_status = 'voided'.

Also, when a user has two orders linked to two different invoice records, but 1 invoice's invoice_status is 'paid', while the other is 'voided', I only want to return the 'paid' record. There can be many voided invoices, but only 1 'paid' invoice record.

By the way, this is enforced: sql_mode=only_full_group_by, so I can't do GROUP BY user_id without having some kind of aggregated or condition on order_id & invoice_id fields.


SELECT 
    A.user_id,
    B.order_id,
    C.invoice_id,
    C.invoice_status
FROM users A
LEFT JOIN orders B 
    ON (B.user_id = A.user_id)
LEFT JOIN invoices C
    ON (C.invoice_id = B.invoice_id);
user_id order_id invoice_id invoice_status
1 1 1 voided
1 2 2 paid
2 3 1 voided
3 NULL NULL NULL

My desired result:

user_id order_id invoice_id invoice_status
1 2 2 paid
2 3 1 voided
3 NULL NULL NULL

Each user_id must only return once, with invoice_status = 'paid' being the preferred row when there are multiple related orders.


Appreciate it if anyone has any idea how to achieve this.

Thanks!

CodePudding user response:

Use ROW_NUMBER() function And number the rows according to user_id and sort according to invoice_status then fetch the first row

SELECT user_id,
       order_id,
       invoice_id,
       invoice_status
FROM
  (SELECT 
    A.user_id,
    B.order_id,
    C.invoice_id,
    C.invoice_status,
    ROW_NUMBER() OVER(PARTITION BY A.user_id ORDER BY C.invoice_status) AS num
  FROM users A
  LEFT JOIN orders B 
    ON (B.user_id = A.user_id)
  LEFT JOIN invoices C
    ON (C.invoice_id = B.invoice_id)) t
WHERE num = 1

db<>fiddle

CodePudding user response:

I am able to solve it by using EXISTS thanks to ProGu's comment.

SELECT 
    A.user_id,
    B.order_id,
    C.invoice_id,
    C.invoice_status
FROM users A
LEFT JOIN orders B 
    ON B.user_id = A.user_id
    AND EXISTS( SELECT 
            1
        FROM
            invoices
        WHERE
            invoices.invoice_id = B.invoice_id
                AND invoices.invoice_status = 'paid')
LEFT JOIN invoices C
    ON C.invoice_id = B.invoice_id
user_id order_id invoice_id invoice_status
1 2 2 paid
2 NULL NULL NULL
3 NULL NULL NULL

Thanks!

  • Related