I am writing a CETAS(create external table as select) query in TSQL using synapse. As all these CETAS queries generate a .csv file in a location (in storage account/data lake gen2), none of these files have headers.
Is there a possible solution or work-around using tsql or synapse analytics?
The following query is from azure documentation, this is also missing headers
-- use CETAS to export select statement with OPENROWSET result to storage
CREATE EXTERNAL TABLE population_by_year_state
WITH (
LOCATION = 'aggregated_data/',
DATA_SOURCE = population_ds,
FILE_FORMAT = census_file_format
)
AS
SELECT decennialTime, stateName, SUM(population) AS population
FROM
OPENROWSET(BULK 'https://azureopendatastorage.blob.core.windows.net/censusdatacontainer/release/us_popula
tion_county/year=*/*.parquet',
FORMAT='PARQUET') AS [r]
GROUP BY decennialTime, stateName
GO
-- you can query the newly created external table
SELECT * FROM population_by_year_state
In my case I have a JSON and other also result set of previously created external tables.
CodePudding user response:
How is your file format defined? Here is the code we use that generates column headers:
DROP EXTERNAL FILE FORMAT EXT_File_Format_CSV
CREATE EXTERNAL FILE FORMAT EXT_File_Format_CSV
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (
FIELD_TERMINATOR = ',',
STRING_DELIMITER = '"',
PARSER_VERSION = '2.0',
FIRST_ROW = 2
)
);