In Oracle 12.1, the following query...
select convert(N'è','US7ASCII','EE8MSWIN1250') from dual;
Returns "c". I was hoping for an "e".
Why is this? I am not well-versed in character set issues.
CodePudding user response:
N'è'
is 0x00E8 in UTF-16, so those bytes are your raw input to the CONVERT function. The source charset is EE8MSWIN1250, so Oracle assumes that you want 0xE8 in that codepage.
In the EE8MSWIN1250 codepage, 0xE8 maps to č; if you convert that to ASCII, the closest match is "c".
Similar with your other examples - ï = 0xEF, which is ď in EE8MSWIN1250.
Honestly, I'm surprised that these returned anything - an awful lot of the charset conversions I try with CONVERT just don't work at all.
CodePudding user response:
I don't know why. But, documentation says
For complete correspondence in character conversion, it is essential that the destination character set contains a representation of all the characters defined in the source character set. Where a character does not exist in the destination character set, a replacement character appears
SQL> select convert(N'è','US7ASCII','EE8MSWIN1250') from dual;
---- -------- ------------
C char destination source charset
-
c
Apparently, US7ASCII doesn't contain all characters from EE8MSWIN1250 character set; does it? Replacement character for 'è' (when those two character sets are involved) seems to be c
, so that's what you got.