Why do Cyrillic strings in hexadecimal format differ from Cyrillic chars in hexadecimal format?
str := "Э"
fmt.Printf("%x\n", str)
//result d0ad
r := 'Э'
fmt.Printf("%x\n", r)
//result 42d
Answer:
Printing the hexadecimal representation of a string prints the hex representation of its bytes, while printing the hexadecimal representation of a rune prints the hex representation of its integer value, the Unicode code point (rune is an alias for int32).
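To make the two behaviours concrete, here is a minimal sketch that formats the same character both ways. The helper names hexOfString and hexOfRune are made up for this illustration and are not part of the fmt package.

```go
package main

import "fmt"

// hexOfString (hypothetical helper) formats the bytes of s as hex,
// which is what %x does when given a string.
func hexOfString(s string) string {
	return fmt.Sprintf("%x", s)
}

// hexOfRune (hypothetical helper) formats the integer value of r as hex,
// which is what %x does when given a rune (int32).
func hexOfRune(r rune) string {
	return fmt.Sprintf("%x", r)
}

func main() {
	s := "Э" // string: a sequence of bytes
	r := 'Э' // rune: a single int32 holding the code point

	fmt.Println(hexOfString(s)) // d0ad
	fmt.Println(hexOfRune(r))   // 42d

	// Explicit conversions show what is actually being printed.
	fmt.Printf("% x\n", []byte(s)) // d0 ad
	fmt.Printf("%d\n", int32(r))   // 1069
}
```

If you want the code point from a string, range over it or convert it with []rune(s) first; if you want the bytes from a rune, convert it with string(r) first.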
Strings in Go hold the UTF-8 encoded byte sequence of the text. In UTF-8, characters (runes) with a code point greater than 127 are encoded as multiple bytes.
The rune Э has a two-byte representation in UTF-8, [208, 173] (0xd0, 0xad), and it is not the same as the byte representation of the 32-bit integer 1069 = 0x42d. Integers are represented using two's complement in memory, and %x prints the integer's numeric value, not its UTF-8 encoding.
Recommended blog post: Strings, bytes, runes and characters in Go