Home > Blockchain >  Matching SQL Server Modulo on a VARBINARY in C#
Matching SQL Server Modulo on a VARBINARY in C#

Time:04-13

Some time back we implemented a slight variation of this Optimizing Protected Indexes approach from SQL Server Central to indexing encrypted data in the ufn_GroupData sample function:

CREATE FUNCTION [dbo].[fncGetBucket]    
(  
    @String NVARCHAR(3500) 
) 
RETURNS INT   
AS
BEGIN  
    DECLARE @Result INT = NULL;  
    IF @String IS NOT NULL
        SET @Result = HASHBYTES('SHA2_256', @String) % 20000;  
    RETURN @Result;   
END

A few million rows are happily stored in the database using the result of this UDF as a "bucketing" index.

Now we would like to be able to generate the matching bucket value in C# and the deceptively simple SQL above doesn't make it clear how to do that.

In fact, I've found it hard to get started. But I can say this code:

byte[] encoded = SHA256.Create().ComputeHash(Encoding.Unicode.GetBytes(str));

generates the same hash as HASHBYTES, so I'm past the first step.

So what SQL is actually doing with that modulo operator???

All help appreciated!

CodePudding user response:

HASHBYTES('SHA2_256') returns a 32 byte ByteArray

What internally is happening is

cast(HASHBYTES('SHA2_256', @String) as int) % 20000

So there's an integer overflow here that is gracefully ignored by SQL Server.
This is undefined behaviour and not easily converted to C#.

You can achieve something similar by taking the last 4 bytes and modulo those.
This seems to work for your mod 20000.

byte[] encoded = SHA256.Create().ComputeHash(Encoding.UTF8.GetBytes("asdf"));

var last4Bytes = encoded.Skip(28).Take(4).Reverse().ToArray();
var myInt = BitConverter.ToInt32(last4Bytes);
var moduloResult = myInt % 20000;

This gives me the same result as

print HASHBYTES('SHA2_256', 'asdf') % 20000

12635

  • Related