I'm following an online video tutorial on Azure Synapse Analytics. The instructor used the following script:
SELECT TOP 10 *
FROM
OPENROWSET(
BULK 'taxi/raw/trip_type.tsv',
DATA_SOURCE ='nyc_taxidata',
FORMAT='CSV',
PARSER_VERSION='2.0',
HEADER_ROW=TRUE,
FIELDTERMINATOR = '\t'
) AS trip_type;
My question is why does the script work when the format is FORMAT='CSV'
?
Also when I tried to use FORMAT='TSV'
I got the following error message:
Parser version '2.0' is not supported for provided format 'TSV'.
CodePudding user response:
The CSV stands for Comma Separated Values
, which means by default the FIELDTERMINATOR
is Comma. But it doesn't mean we cannot use any other delimiter. The delimiter could be any character like |
, \n
, :
, etc., like in your case it's \t
. Hence, whether it is ,
or any other, the file will only be called CSV and only the DELIMITER
will change.
Coming to second part of your question.
CSV parser version 1.0 is default and feature rich. Version 2.0 is built for performance and does not support all options and encodings.
Use PARSER_VERSION='1.0'
in the 7th line of your code, and your issue will be resolved.
Refer: How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics