I would like to create tables which are shared in Big Query UI with suffix of the form _YYYYMMDD through a loop.
My final tables would then look as
tablename_20200801
tablename_20200901
tablename_20201001
...
Whilst I have found many ways to read in all the sharded tables. I cannot seem to find a way to create these sharded tables using a loop.
DECLARE -- change dates to match accordingly
v_snapshot_date date;
DECLARE
end_of_month date;
DECLARE
final_snapshot_date DEFAULT DATE '2020-10-01';
DECLARE
initial_snapshot_date date DEFAULT DATE '2020-08-01';
LOOP
SET
v_snapshot_date = DATE_ADD(v_snapshot_date,INTERVAL 1 month);
SET
end_of_month = LAST_DAY(v_snapshot_date);
IF
v_snapshot_date > final_snapshot_date THEN
LEAVE; -- kill
END IF;
-- create some variables, through a temporary table called tablename.
-- this is the table we want to shard with v_snapshot_date and save
EXECUTE IMMEDIATE
"CREATE OR REPLACE TEMP TABLE tablename (title STRING, publish_date INT64)";
EXECUTE IMMEDIATE -- Add a row with title=Hamlet and with date=1599
"INSERT INTO tablename (title, publish_date) VALUES('Hamlet', 1599)";
END LOOP;
CodePudding user response:
Try something like this:
DECLARE -- change dates to match accordingly
v_snapshot_date date;
DECLARE
end_of_month date;
DECLARE
final_snapshot_date DEFAULT DATE '2020-10-01';
DECLARE
initial_snapshot_date date DEFAULT DATE '2020-08-01';
DECLARE
formatted_date STRING;
SET v_snapshot_date = initial_snapshot_date;
LOOP
SET
v_snapshot_date = DATE_ADD(v_snapshot_date,INTERVAL 1 month);
SET
formatted_date = FORMAT_DATE("%Y%m%d", v_snapshot_date);
SET
end_of_month = LAST_DAY(v_snapshot_date);
IF
v_snapshot_date > final_snapshot_date THEN
LEAVE; -- kill
END IF;
-- create some variables, through a temporary table called tablename.
-- this is the table we want to shard with v_snapshot_date and save
EXECUTE IMMEDIATE
FORMAT("CREATE OR REPLACE TABLE `elzagales.so_test.tablename_%s` (title STRING, publish_date INT64)", formatted_date);
EXECUTE IMMEDIATE -- Add a row with title=Hamlet and with date=1599
FORMAT("INSERT INTO `elzagales.so_test.tablename_%s` (title, publish_date) VALUES('Hamlet', 1599)", formatted_date);
END LOOP;
I've modified your original code with the following:
- defining of a string to use as the table suffix
- setting of the v_snapshot_date to the initial_snapshot_date
- formatting of the DDL and DML statements
As a note, you may want to set your initial_snapshot_date one date further back because of the incrementing of the value by one month at the start of the loop.
Alternatively you could use a FOR LOOP which may be simpler:
DECLARE end_of_month date;
DECLARE final_snapshot_date DEFAULT DATE '2020-10-01';
DECLARE initial_snapshot_date date DEFAULT DATE '2020-08-01';
DECLARE formatted_date STRING;
FOR date_suffix in (
select * from UNNEST(GENERATE_DATE_ARRAY(initial_snapshot_date, final_snapshot_date, INTERVAL 1 MONTH)) date
)
DO
SET formatted_date = FORMAT_DATE("%Y%m%d", date_suffix.date);
SET end_of_month = LAST_DAY(date_suffix.date);
IF
date_suffix.date > final_snapshot_date THEN
LEAVE; -- kill
END IF;
-- create some variables, through a temporary table called tablename.
-- this is the table we want to shard with v_snapshot_date and save
EXECUTE IMMEDIATE
FORMAT("CREATE OR REPLACE TABLE `project_id.dataset_id.tablename_%s` (title STRING, publish_date INT64)", formatted_date);
EXECUTE IMMEDIATE -- Add a row with title=Hamlet and with date=1599
FORMAT("INSERT INTO `project_id.dataset_id.tablename_%s` (title, publish_date) VALUES('Hamlet', 1599)", formatted_date);
END FOR;