Let me begin by apologizing for what may have been a confusing title. I an just beginning my data analyst journey. I am working in BIGQUERY with a Extreme Storm dataset (TABLE1) that has fields for LAT,LONG, and STATE. There are null values in the latitude and longitude fields that I want to replace with general LAT/LONG values from a State Information dataset(TABLE2) also containing LAT,LONG and STATE values. In TABLE1 each record is given a unique EVENT_ID and there are 1.4m rows. In TABLE2 each STATE is a unique record.
I've tried:
Update TABLE1
SET TABLE1.BEGIN_LAT=TABLE2.latitude
From TABLE1
INNER JOIN TABLE2
ON TABLE1.STATE = TABLE2.STATE
WHERE TABLE1.BEGIN_LAT IS NULL
I am getting an error because TABLE1 contains multiple rows with the same STATE and I am trying to use it as my primary key. I know what I am doing wrong but can't figure out how to do it the correct way. Is what I am trying to do possible in BigQuery?
Any help would be appreciated. Even advice on how to ask questions! :)
Thank you.
CodePudding user response:
I believe you have in your query some alias for TABLE1 in Update and for TABLE1 in From. In this case you can add condition to the WHERE clause to also match on EVENT_ID. Like this:
UPDATE TABLE1 TABLE1_U
SET TABLE1_U.BEGIN_LAT=TABLE2.latitude
FROM TABLE1 TABLE1_F
INNER JOIN TABLE2
ON TABLE1_F.STATE = TABLE2.STATE
WHERE TABLE1_U.BEGIN_LAT IS NULL AND TABLE1_U.EVENT_ID = TABLE1_F.EVENT_ID
Also, I would prefer to do SELECT query instead of update and save query results to the new table.