We can get the Unicode code point value of the string 你:
u'你'.encode('unicode-escape')
b'\\u4f60'
Why is the string in Unicode form not equal to its Unicode code point value?
u'你' == u'\x4f\x60'
False
u'你' == u'\\u4f60'
False
CodePudding user response:
It is, but the strings you are comparing against are not the right ones. The first is two separate characters, each written with a one-byte \x escape (U+004F and U+0060), and the second has its backslash escaped, so it is the literal six characters \u4f60.
u'你' == u"\u4f60"
True
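To make the difference concrete, here is a small sketch (assuming Python 3; the variable names are only for illustration) that inspects the three literals compared above:

```python
# Python 3: inspect the three literals from the comparisons above.
correct = "\u4f60"      # one character, code point U+4F60
two_chars = "\x4f\x60"  # two characters: U+004F and U+0060
escaped = "\\u4f60"     # six characters: backslash, u, 4, f, 6, 0

print(len(correct), len(two_chars), len(escaped))  # 1 2 6
print(two_chars)          # O`
print(hex(ord(correct)))  # 0x4f60
print(correct == "你")    # True
```

Only the first literal is a single character whose code point is U+4F60, which is why it is the only one that compares equal.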
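The encoded byte string is displayed with two backslashes because its repr escapes the one literal backslash it contains. It holds the six ASCII bytes \u4f60, so it is not equivalent to the original string, even if turned back into a string, unless you decode it with unicode-escape as well.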
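A minimal sketch of that round trip, again assuming Python 3 (the variable names are illustrative):

```python
# Python 3: round-trip through the unicode-escape codec.
encoded = "你".encode("unicode-escape")
print(encoded)       # b'\\u4f60' (the repr doubles the single backslash)
print(len(encoded))  # 6

# Decoding with the same codec interprets the escape sequence again.
decoded = encoded.decode("unicode-escape")
print(decoded == "你")  # True

# Decoding as plain ASCII keeps the six literal characters instead.
print(encoded.decode("ascii") == "你")  # False
```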
Side note: the u prefix is the default in Python 3, so it can be omitted.