Home > Software design >  External table Always insert in a new file, is there any way to write to same file?
External table Always insert in a new file, is there any way to write to same file?

Time:06-10

I have a external table in SQL Server which points to CSV files in folder of Azure blob storage, I enabled polybase export and trying to insert data using insert query. It works but it always creates new file.

Is there any way I can write to single file or give file name while insert?

Here's my table

CREATE EXTERNAL TABLE archive.filetransferauditlog (
    [id] [int]  NULL,
    [STATUS] [varchar](10)  NULL,
    [EVENT] [varchar](10)  NULL,
    [fileNameWithPath] [varchar](2048) NULL,
    [eventStartDate] [datetime] NOT NULL,
    [eventEndDate] [datetime] NOT NULL,
    [description] [varchar](4096) NULL,
    [loggedInUserId] [int] NULL,
    [transferType] [int] NULL
    )
    WITH (
    LOCATION = '/filetransferauditlog/',
    DATA_SOURCE = archivepurgedataExternalDataSource,
    FILE_FORMAT = ParquetFile
    )
GO

Query I am using:

Insert into archive.filetransferauditlog
select Top(5)
from dbo.filetransferauditlog

Please suggest me any way we can give the file name while insert.

When I try to give location for table to a single file instead of directory, I am able to run select query but not insert.

It returns below error:

java.sql.SQLException: Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "SQLNCLI11". CREATE EXTERNAL TABLE AS SELECT statement failed as the path name 'wasbs://[email protected]/filetransferauditlogText/QID5060_20220607_54101_0.txt' could not be used for export. Please ensure that the specified path is a directory which exists or can be created, and that files can be created in that directory.

CodePudding user response:

java.sql.SQLException: Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "SQLNCLI11". CREATE EXTERNAL TABLE AS SELECT statement failed as the path name 'wasbs://[email protected]/filetransferauditlogText/QID5060_20220607_54101_0.txt' could not be used for export. Please ensure that the specified path is a directory which exists or can be created, and that files can be created in that directory.

As per MsDocs, The reason behind getting this error because of PolyBase external table pointed out repeatedly reads all the files, so column or data may be discrepancy

  • Hope you created the external table first and then use INSERT INTO SELECT to export to the external location. While exporting, only data can be exported not a column

Please suggest me any way we can give the file name while insert

  • If you want data in each table consists in single file, use FILE name in the location section by the directory
  • If you want multiple file in table set the files into different directories
  • If you want to create a table on top of csv file just use LOCATION '/warehouse/develop/myfile
  • Make sure you are giving correct path in location attributes are not, give your location along with your file name as below
Create External Table Your table name
(col1 int)WITH (LOCATION=  '/filetransferauditlog/ your file name' ,DATA_SOURCE); OR
Create External Table  Your table name
(col1 int) WITH (LOCATION = '/filetransferauditlog/ ',DATA_SOURCE);

For inserting into same file use Join query while export please check whether you are using correct format as sample below

INSERT  INTO dbo.filetransferauditlog 
SELECT T.* FROM Insured_Customers T1 JOIN CarSensor_Data T2 
ON (T1.CustomerKey = T2.CustomerKey)
WHERE T2.YearMeasured = 2009  and T2.Speed > 40;

For more information in detail, please refer below links:

PolyBase query scenarios

PolyBase errors and possible solutions

CodePudding user response:

Here is the query that you should execute

SELECT *
INTO OUTFILE 'C:\\Donnees\\dev\\table_exp.txt'
FIELDS TERMINATED BY ';' ENCLOSED BY '"'
LINES STARTING BY 'export-table' TERMINATED BY '$\n'
FROM name_of_table;

Note: Into outfile his is the file path fields terminated by:his is the file path enclosed by:the symbol that frames the column values lines starting by this is how your recording in the file will start TERMINATED BY:with which symbol does your recording end?

  • Related