Home > Mobile >  Update table columns based on values in other table
Update table columns based on values in other table

Time:10-27

I have two tables

                 showroom
============================================
model_id    car_name         is_available
--------------------------------------------
   0        rav4              false
   1        tacoma            false
   2        corolla           false
   3        tundra            false

and

      warehouse
========================
model_id    car_name         
------------------------
   0        rav4              
   1        tacoma   

The primary key is model_id. I want to update the column showroom.is_available to true if the model_id is present in warehouse. So the output according to the above example would be

                  showroom
    ============================================
    model_id    car_name         is_available
    --------------------------------------------
       0        rav4              true
       1        tacoma            true
       2        corolla           false
       3        tundra            false

I know I can do

UPDATE showroom
SET is_available = true
WHERE model_id IN (
SELECT model_id
FROM warehouse
);

But both tables have over a billion rows, and even though I am using Spark SQL the queries are very slow

CodePudding user response:

SELECT
  a.model_id
  ,a.car_name
  ,CASE WHEN b.model_id IS NOT NULL THEN TRUE ELSE FALSE END is_available
FROM
  showroom a
  LEFT OUTER JOIN warehouse b ON (a.model_id = b.model_id)

CodePudding user response:

On big tables a join is faster than the IN clause

And see if you can#t reduce the number

For showroom you can use a combined index on (model_id,is_available)

When you are able also reduce the number of possible candidates

UPDATE showroom s
INNER JOIN warehouse w ON s.model_id = w.model_id
SET s.is_available = true
WHERE s.is_available = FALSE
  • Related