Home > Enterprise >  create date sharded table in Big Query UI with suffix of the form _YYYYMMDD using a loop
create date sharded table in Big Query UI with suffix of the form _YYYYMMDD using a loop

Time:09-12

I would like to create tables which are shared in Big Query UI with suffix of the form _YYYYMMDD through a loop.

My final tables would then look as

  • tablename_20200801
  • tablename_20200901
  • tablename_20201001
  • ...

Whilst I have found many ways to read in all the sharded tables. I cannot seem to find a way to create these sharded tables using a loop.

DECLARE -- change dates to match accordingly
  v_snapshot_date date;
DECLARE
  end_of_month date;
DECLARE
  final_snapshot_date DEFAULT DATE '2020-10-01';
DECLARE
  initial_snapshot_date date DEFAULT DATE '2020-08-01';

LOOP
SET
  v_snapshot_date = DATE_ADD(v_snapshot_date,INTERVAL 1 month);
SET
  end_of_month = LAST_DAY(v_snapshot_date);
IF
  v_snapshot_date > final_snapshot_date THEN
LEAVE; -- kill
END IF;
-- create some variables, through a temporary table called tablename.
-- this is the table we want to shard with v_snapshot_date and save
EXECUTE IMMEDIATE
  "CREATE OR REPLACE TEMP TABLE tablename (title STRING, publish_date INT64)";

EXECUTE IMMEDIATE -- Add a row with title=Hamlet and with date=1599
  "INSERT INTO tablename (title, publish_date) VALUES('Hamlet', 1599)";

END LOOP;

CodePudding user response:

Try something like this:

DECLARE -- change dates to match accordingly
  v_snapshot_date date;
DECLARE
  end_of_month date;
DECLARE
  final_snapshot_date DEFAULT DATE '2020-10-01';
DECLARE
  initial_snapshot_date date DEFAULT DATE '2020-08-01';
DECLARE
  formatted_date STRING;
SET v_snapshot_date = initial_snapshot_date;

LOOP
SET
  v_snapshot_date = DATE_ADD(v_snapshot_date,INTERVAL 1 month);
SET
  formatted_date = FORMAT_DATE("%Y%m%d", v_snapshot_date);
SET
  end_of_month = LAST_DAY(v_snapshot_date);
IF
  v_snapshot_date > final_snapshot_date THEN
LEAVE; -- kill
END IF;
-- create some variables, through a temporary table called tablename.
-- this is the table we want to shard with v_snapshot_date and save
EXECUTE IMMEDIATE
  FORMAT("CREATE OR REPLACE TABLE `elzagales.so_test.tablename_%s` (title STRING, publish_date INT64)", formatted_date);

EXECUTE IMMEDIATE -- Add a row with title=Hamlet and with date=1599
  FORMAT("INSERT INTO `elzagales.so_test.tablename_%s` (title, publish_date) VALUES('Hamlet', 1599)", formatted_date);

END LOOP;

I've modified your original code with the following:

  • defining of a string to use as the table suffix
  • setting of the v_snapshot_date to the initial_snapshot_date
  • formatting of the DDL and DML statements

As a note, you may want to set your initial_snapshot_date one date further back because of the incrementing of the value by one month at the start of the loop.

Alternatively you could use a FOR LOOP which may be simpler:

DECLARE end_of_month date;
DECLARE final_snapshot_date DEFAULT DATE '2020-10-01';
DECLARE initial_snapshot_date date DEFAULT DATE '2020-08-01';
DECLARE formatted_date STRING;

FOR date_suffix in (
  select * from UNNEST(GENERATE_DATE_ARRAY(initial_snapshot_date, final_snapshot_date, INTERVAL 1 MONTH)) date
)
DO

  SET formatted_date = FORMAT_DATE("%Y%m%d", date_suffix.date);
  SET end_of_month = LAST_DAY(date_suffix.date);
  IF 
    date_suffix.date > final_snapshot_date THEN
    LEAVE; -- kill
  END IF;
  -- create some variables, through a temporary table called tablename.
  -- this is the table we want to shard with v_snapshot_date and save
  EXECUTE IMMEDIATE
    FORMAT("CREATE OR REPLACE TABLE `project_id.dataset_id.tablename_%s` (title STRING, publish_date INT64)", formatted_date);

  EXECUTE IMMEDIATE -- Add a row with title=Hamlet and with date=1599
    FORMAT("INSERT INTO `project_id.dataset_id.tablename_%s` (title, publish_date) VALUES('Hamlet', 1599)", formatted_date);

END FOR;
  • Related