I'm using Synapse Serverless which does not support materialized views, but some use cases need them for performance reasons. As a workaround I was going to use CETAS statements to persist files and schedule regular updates for them. However, when I try to drop and recreate a CETAS table I get message:
Cannot create external table. External table location already exists. Location provided: [location]
Once the files are created by the CETAS statement it looks like I can't overwrite them with another CETAS statement and I haven't found any other options for modifying files within synapse. I know I can create a larger workflow to delete from storage before rerunning CETAS, but was hoping there was some way to keep things contained to Synapse.
Is there some way to delete/overwrite/modify the files created by a CETAS statement from within Synapse?
CodePudding user response:
As you've found, dropping the table doesn't delete the files. This is outlined in the documentation (see Remarks section) ...
An example is shown below, naturally, it's missing some context but you should get the general idea ...
DECLARE @CurrentTimeStamp varchar(100) = (SELECT REPLACE(REPLACE(REPLACE(CONVERT(varchar, CURRENT_TIMESTAMP, 126), '-', ''), 'T', '_'), ':', ''))
BEGIN TRY
SET @Sql = '
IF EXISTS (SELECT * FROM sys.external_tables WHERE name = ''' @TableName ''' AND schema_id = ' CAST(@SchemaIdReporting as varchar(10)) ')
BEGIN
DROP EXTERNAL TABLE [reporting].[' @TableName ']
END
CREATE EXTERNAL TABLE [reporting].[' @TableName ']
WITH (
LOCATION = ''reporting/' @TableName '/' @CurrentTimeStamp ''',
DATA_SOURCE = [Analytics],
FILE_FORMAT = [Parquet]
) AS
' @Sql
EXEC (@Sql)
END TRY
BEGIN CATCH
SET @ErrorSection = 'Part 1: SQL statement construction.'
SELECT @ErrorMessage = @ErrorSection ', ' ERROR_MESSAGE(), @ErrorSeverity = ERROR_SEVERITY(), @ErrorState = ERROR_STATE()
PRINT @Sql
RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState)
END CATCH
... further to that, modifying (update/insert/delete) the data in an external table is not currently supported and is a limitation. That can be found here ...
The only consideration is a clean up of the old files. I'm yet to do that but like the concept of having a snapshot of all the old data. I haven't needed to go back and access any of it yet but it's there if I need to.