I want to create a new table from an existing one, and add one column. But, and this seems to make this tricky, I want it to be partitioned by _PARTITIONTIME.
I know I can create a table from an existing table, like so:
CREATE OR REPLACE TABLE `mydataset.mytable_new`
AS SELECT * FROM `mydataset.mytable`
--JOIN the new column here
LIMIT 0
I also know that I can create a partitioned table, like so:
CREATE OR REPLACE TABLE `mydataset.mytable_new`
(
date DATE,
var1 STRING,
var2 INT64,
--add new column here
)
PARTITION BY DATE(_PARTITIONTIME);
But: How can I combine the two? I tried this:
CREATE OR REPLACE TABLE `mydataset.mytable_new`
PARTITION BY DATE(_PARTITIONTIME)
AS SELECT * FROM `mydataset.mytable`
-- JOIN new column here
LIMIT 0
However, this gives me the error 'Unrecognized name: _PARTITIONTIME'.
Any hints are greatly appreciated!
CodePudding user response:
You may try like :
CREATE TABLE
mydataset.newtable (transaction_id INT64, transaction_date DATE)
PARTITION BY
transaction_date
AS SELECT transaction_id, transaction_date FROM mydataset.mytable
From doc : cloud.google.com/bigquery/docs/creating-partitioned-tables#sql
CodePudding user response:
This is a documented limitation of the partitioning expression of the CREATE TABLE syntax:
_PARTITIONDATE. Partition by ingestion time with daily partitions. This syntax cannot be used with the AS query_statement clause.
I believe you should be able to split the work. Use a statement to create the new table, then issue INSERT statement(s) to populate from the original table.
However, if you're already dealing with a sizable table, you may want to re-consider this partitioning scheme. By default, all the data from the original table would land in a single partition (the current date).