When defining a multi-column cluster key in Snowflake, is it necessary to include parentheses in the cluster-key expression? Or are comma separated values sufficient?
To demonstrate, I am sure the first expression is correct (with parentheses). I am not sure if the second one is as well.
-- With parentheses
ALTER TABLE my_snowflake_table CLUSTER BY (column_1, column_2);
-- Without parentheses
ALTER TABLE my_snowflake_table CLUSTER BY column_1, column_2;
CodePudding user response:
You have to use parenthesis for even a single column or you'll get a SQL compile error:
create or replace table foo (x int, y int);
-- Compile error
alter table foo cluster by x;
-- Works
alter table foo cluster by (x, y);
Before defining a clustering key on two columns, be sure that the cardinality of your key is reasonable, generally you want it to be lower than the number of micro-partitions on your table.
You can find the number of micro-partitions on the table by running
select system$clustering_information('table_name', '(col1, col2)');