Home > Back-end >  Concatenating SQL Server VARCHAR(N), results with unexpected and inconsistent truncated results when
Concatenating SQL Server VARCHAR(N), results with unexpected and inconsistent truncated results when

Time:10-03

In the following code we see that concatenation of varchar(n) where n>4000, results with unexpected and inconsistent truncated results when converted to nvarchar(max):

  1. When only varchar(n) values are involved in the concatenation the result is truncated to 8000 chars.
  2. A sequence of varchar(n) concatenations, which is done prior to a concatenation to an nvarchar(max), is truncated to 4000 chars, before it is concatenated to the nvarchar(max).
  3. A sequence of varchar(n) concatenations, which is done after a concatenation to an nvarchar(max), each varchar(n) is truncated to 4000 chars before it is concatenated

I guess that the implicit cast is done in order of concatenation, but then I would say that (1) behave as expected, (2) should be truncated to 8000 and (3) should not be truncated at all.

It seems like there is some kind of a (wrong) length cast to 4000 that is done out of the concatenation order which is somehow related to the resulting type NVARCHAR. But then if the intention was to cast all the concatenations according to the resulting type, then, since it is MAXed, no truncates would have been expected (which would give the best result).

Where am I wrong?

DECLARE @x AS NVARCHAR(MAX)
DECLARE @y AS NVARCHAR(MAX) = '_10 chars_'
DECLARE @z AS VARCHAR(4200) = replicate('_', 4200)

SET @x = @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 4200'

SET @x = @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 8400 = 4200 * 2'

SET @x = @z   @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 12600 = 4200 * 3'


SET @x = @z   @y
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 4210 = 4200   10'

SET @x = @z   @z   @y
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 8410 = 4200 * 2   10'

SET @x = @z   @z   @z   @y
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 12610 = 4200 * 3   10'


SET @x = @y   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 4210 = 10   4200'

SET @x = @y   @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 8410 = 10   4200 * 2'

SET @x = @y   @z   @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 12610 = 10   4200 * 3'


SET @x = @z   @z   @z   @y   @z   @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 25210 = 4200 * 3   10   4200 * 3'

CodePudding user response:

When you concatenate two varchar values together, the resulting length of the data type is the sum of the two, up to a maximum of 8000, any longer will be truncated.

When you concatenate two nvarchar values together, the resulting length of the data type is the sum of the two, up to a maximum of 4000, any longer will be truncated.

Only when you store it in a max variable does the value become max, the truncation will happen before then

Furthermore, you have a second implicit conversion going on here, from varchar to nvarchar, which itself causes truncation because nvarchar can only be up to 4000.

If you concatenate a nvarchar and a varchar, the varchar is converted to match, but capped at 4000. The final result will not become max until it is stored in the variable. So the implicit conversion will be done after that truncation.

Your results now make perfect sense:

  • The length is already 4200, and is extended to max.
SET @x = @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 4200'

  • In these cases, the intermediate value is capped at 8000, and only then converted to max
SET @x = @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 8400 = 4200 * 2'

SET @x = @z   @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 12600 = 4200 * 3'

  • In these cases, an implicit conversion to nvarchar is done first, then concatenated with a max afterwards, so the original @z value was truncated first.
SET @x = @z   @y
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 4210 = 4200   10'

SET @x = @z   @z   @y
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 8410 = 4200 * 2   10'

SET @x = @z   @z   @z   @y
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 12610 = 4200 * 3   10'

SET @x = @y   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 4210 = 10   4200'

SET @x = @y   @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 8410 = 10   4200 * 2'

SET @x = @y   @z   @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 12610 = 10   4200 * 3'

SET @x = @z   @z   @z   @y   @z   @z   @z
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 25210 = 4200 * 3   10   4200 * 3'

The only way to guarantee the right results is to cast your values to nvarchar(max) first

SET @x = CAST(@z AS nvarchar(max))   CAST(@z AS nvarchar(max))   CAST(@z AS nvarchar(max))   @y   CAST(@z AS nvarchar(max))   CAST(@z AS nvarchar(max))   CAST(@z AS nvarchar(max))
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 25210 = 4200 * 3   10   4200 * 3'

Alternatively, just use CONCAT, which deals with all of this correctly anyway

SET @x = CONCAT(@z, @z, @z, @y, @z, @z, @z);
print CAST(len(@x) AS NVARCHAR(MAX))   '    Expected 25210 = 4200 * 3   10   4200 * 3'

db<>fiddle

  • Related