What is wrong with my SQL query in snowflake? Error message: "missing 'SET' at '-CodePudding

I am executing this query from a python connection to snowflake and using pyspark like so:

update = """
UPDATE FINAL_TBL x
FROM STG_TBL y
SET x.NAME = y.NAME
WHERE y.customer_id = x.customer_id AND y.end_date = x.end_date
"""

Then I execute this through a pyspark connection to sf like so:

spark.sql(update)

When I hit run I get the following error:

missing 'SET' at 'FROM' (line 3, pos 0)

== SQL ==

UPDATE FINAL_TBL x
^^^
SET x.NAME = y.NAME
WHERE y.customer_id = x.customer_id AND y.end_date = x.end_date

Can someone show what is incorrect about this query?

Thank you

I've tried re-writing the query substituting "USING" instead of "FROM". Extensive google searches dont lead to conclusive results. When I ran these through SQL validations I got back similar errors.

CodePudding user response：

Try to think of the UPDATE command when you are joining to a secondary (or even multiple tables) as a SELECT FIRST. Once you get that query working, change to an update. Ex from your scenario

select
      ft.name,
      st.name
   from
      Final_Tbl ft
         JOIN stg_tbl st
            on ft.customer_id = st.customer_id
           AND ft.end_date = st.end_date

Once this is confirmed, you should be able to see the name column from each respective table, so you know you have the join and aliases correct and can see the names. Now simply change to update/set.

update ft set
      name = st.name
   from
      Final_Tbl ft
         JOIN stg_tbl st
            on ft.customer_id = st.customer_id
           AND ft.end_date = st.end_date

CodePudding user response：

You have the SET and FROM clauses out of order:

UPDATE <target_table> SET <col_name> = [ , <col_name> = , ... ] [ FROM <additional_tables> ] [ WHERE ]

Try:

update = """
UPDATE FINAL_TBL x
SET "NAME" = y."NAME"
FROM STG_TBL y
WHERE y.customer_id = x.customer_id AND y.end_date = x.end_date
"""