What is the difference between 0x0a and \n in for the ROWTERMINATOR parameter?-CodePudding

What is the difference between using 0x0a and \n for ROWTERMINATOR when importing data from an external source?

I tried to query data from a JSON file into rows and columns and I got different outcomes. Here is an image of the JSON file:

enter image description here

Here is the code I used:

SELECT TOP 10*
    FROM OPENROWSET(
        BULK 'taxi/raw/payment_type_array.json', 
        DATA_SOURCE='nyc_taxidata',
        FORMAT='CSV',
        PARSER_VERSION='1.0',
        FIELDTERMINATOR='0x0b',
        FIELDQUOTE='0x0b',
        ROWTERMINATOR='\n'
    )
    WITH
    (jsonDoc NVARCHAR(MAX)
    ) AS payment_type
    CROSS APPLY OPENJSOn(jsonDoc)
    WITH(
        payment_type SMALLINT,
        payment_type_desc NVARCHAR(MAX) AS JSON
    );

Here is the outcome:

enter image description here

When I used '0x0a' as the FIELDTERMINATOR I got the following:

enter image description here

CodePudding user response：

From the documentation

Specifying \n as a Row Terminator for Bulk Import

When you specify \n as a row terminator for bulk import, or implicitly use the default row terminator, bcp and the BULK INSERT statement expect a carriage return-line feed combination (CRLF) as the row terminator. If your source file uses a line feed character only (LF) as the row terminator - as is typical in files generated on Unix and Linux computers - use hexadecimal notation to specify the LF row terminator.

So if you are getting different results for \n and 0x0a then you have some lines that are just using a line feed (LF) character, and some which are using carriage return (CR) followed by a line feed (LF) character (CRLF). This is a problem with the source data, in my opinion, as using inconsistent new line characters can (and does) cause problems.

Specifying \n as a Row Terminator for Bulk Import

Specifying `\n` as a Row Terminator for Bulk Import