In my SQL Server I run
SELECT HashBytes('MD5', CONCAT('',convert(nvarchar,313),'@12')) as mycol
and get
0x6CDF89142FA463156CD250AC0906255C
In Oracle I run
select standard_hash(concat('',concat(cast('313' as nvarchar2(255)),'@12')), 'MD5') from dual;
but I get
7C2325A508108AF673FC9D79D9949915
What am I doing wrong in Oracle? I want to get the same output!
CodePudding user response:
Joel has explained the issue, but as a variation - and because Oracle discourages the use of the CONVERT function - you can use the UTL_I18N package to do the conversion instead:
standard_hash(utl_i18n.string_to_raw(n'313@12', 'AL16UTF16LE'), 'MD5')
This incorporates @MTO's comment about concatenation and simplifying the value. It also assumes your database's national character set is AL16UTF16, but that seems to be the case from the result you got originally.
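If you want to confirm which national character set your database actually uses, you can query the standard data dictionary view, which should be safe to run as-is:

select value
from nls_database_parameters
where parameter = 'NLS_NCHAR_CHARACTERSET';

If that returns AL16UTF16, the string_to_raw call above will behave as described.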
You can continue to build up the value if you need to - perhaps the '313' string is coming from a table? - but the null (empty string) is still pointless, and the concatenation operator is easier to use:
select standard_hash(
utl_i18n.string_to_raw(cast('313' as nvarchar2(255)) || '@12', 'AL16UTF16LE'),
'MD5') as result
from dual;
RESULT
----------------------------------
0x6CDF89142FA463156CD250AC0906255C
CodePudding user response:
The MD5 algorithm doesn't understand strings; it only operates on byte arrays. The exact byte encoding of the input string therefore matters, because even a single bit of difference completely changes the hash output.
In this case, both databases already encode strings as UTF-16, but again: the precise details matter. The specific difference between Oracle and SQL Server is endianness: SQL Server uses a little-endian encoding, while Oracle uses big-endian. You'll need to change the Oracle query so it produces a string with the same little-endian encoding as SQL Server.
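To make the endianness difference concrete, here is a small sketch borrowing UTL_I18N.STRING_TO_RAW (used in the other answer above) to dump the raw bytes of a single character in both byte orders. The output shown in the comments assumes an AL16UTF16 national character set:

-- 'A' is U+0041: big-endian stores the bytes as 00 41,
-- little-endian as 41 00 - same character, different bytes
select utl_i18n.string_to_raw(n'A', 'AL16UTF16')   as utf16_be,
       utl_i18n.string_to_raw(n'A', 'AL16UTF16LE') as utf16_le
from dual;

-- UTF16_BE  UTF16_LE
-- --------  --------
-- 0041      4100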
I think this expression should do the job:
standard_hash(convert(concat('',concat(cast('313' as nvarchar2(255)),'@12')),'AL16UTF16LE'), 'MD5')
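The same expression, wrapped in a complete statement so you can run it directly:

select standard_hash(
         convert(concat('', concat(cast('313' as nvarchar2(255)), '@12')), 'AL16UTF16LE'),
         'MD5') as mycol
from dual;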
However, this stuff can be tricky, and you should test it on a wider variety of possible inputs, with a few different character types, to cover more of what might be included in your live data.
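For instance, here is a quick sanity check with a non-ASCII character; the value is just an illustration, not something from the original question. If the encodings line up, both statements should produce the same 16 bytes:

-- Oracle: force the little-endian UTF-16 encoding before hashing
select standard_hash(convert(n'héllo@12', 'AL16UTF16LE'), 'MD5') as result
from dual;

-- SQL Server: nvarchar is already UTF-16LE, so no conversion needed
SELECT HashBytes('MD5', N'héllo@12') as result;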