Home > Mobile >  How do I exit a stored procedure when I encounter a null value in a column?
How do I exit a stored procedure when I encounter a null value in a column?

Time:06-03

I have a source table in BigQuery into which data is ingested from CSV files. The source table has a column named "ingestDate" of type DATETIME. The destination table's structure is exactly the same as the source.

SOURCE TABLE:

ID ingestDate Column1
113 2022-06-02T00:00:00 Value4
112 null Value3
111 2022-06-01T00:00:00 Value2
110 2022-05-31T00:00:00 Value1

DESTINATION TABLE:

ID ingestDate Column1
111 2022-06-01T00:00:00 Value2
110 022-05-31T00:00:00 Value1

I'm need to write a stored procedure to do the following:

  • Copy all the latest values into destination table from the source table. (This is simple enough by comparing the timestamps and I can get this done)
  • If there are any null values in column "ingestDate" within the source table, then throw an error and exit the stored procedure without mapping any data at all.

Here is some additional information:

  • Column "ingestDate" does not have a NOT NULL clause, and there is no way for me to modify the structure of the table to include one.
  • I am also trying to get the procedure to return the number of new rows inserted.

This is the procedure I have so far:

CREATE OR REPLACE PROCEDURE dataset.sample1()
BEGIN

    DECLARE sourceCount, initDestCount, finalDestCount INT64;
    DECLARE latestIngestDate DATETIME;

    SET sourceCount = (SELECT COUNT(*) FROM dataset.sourceTable);
    SET initDestCount = (SELECT COUNT(*) FROM dataset.destTable);
    SET latestIngestDate = (SELECT MAX(ingestDate) FROM dataset.destTable);

    BEGIN
        INSERT INTO dataset.destTable (
            ID, 
            ingestDate, 
            Column1)

        SELECT
            CAST(ID AS INT64),
            CAST(ingestDate AS DATETIME),
            CAST(Column1 AS STRING)

        FROM
            dataset.sourceTable

        WHERE
            ingestDate > latestIngestDate

    END;

SET finalDestCount = (SELECT COUNT(*) FROM dataset.destTable);

SELECT sourceCount, initDestCount, finalDestCount;

END; 

Can someone please help me implement the error mechanism that I was looking to implement in the procedure above? Any help with optimizing this script would also be greatly appreciated.

Thank you!

CodePudding user response:

can you do a count of the nulls and if there is one then throw an exception before doing your insert?

DECLARE
    e_no_date EXCEPTION;
    PRAGMA exception_init( e_no_date, -20001 );
    mycount int;
 
BEGIN
    SELECT Count(*) 
    INTO mycount
    FROM
       dataset.sourceTable
    WHERE
            ingestDate is null;
    

    IF mycount > 0 THEN 
        RAISE e_no_date;
    END IF;

CodePudding user response:

change the where like this

    WHERE
        ingestDate is not null and
        ingestDate > latestIngestDate

https://cloud.google.com/bigquery/docs/reference/standard-sql/operators#is_operators


Based on comment -- I'd add a flag to the return, throwing exceptions is VERY slow and generally considered a poor choice for flow of control.

SET errorFlag = SELECT COUNT(*) FROM dataset.sourceTable where ingestDate is null;

SELECT sourceCount, initDestCount, finalDestCount, errorFlag;
  • Related