there are characters like ''
that are not visible, so I can't copy and paste them. I want to convert any character to its code point, like '\u200D'.
Another example: 'abc' => '\u0061\u0062\u0063'
CodePudding user response:
Allow me to rephrase your question. The title, "convert a string to its codepoint in python", clearly did not get through to everyone, mostly, I think, because it is hard to imagine what you want it for.
What you want is a string containing a representation of Unicode escapes.
You can do that this way:
print(''.join("\\u{:04x}".format(ord(c)) for c in 'abc'))
\u0061\u0062\u0063
If you display that value at the interactive prompt instead of printing it, you will see its repr with doubled backslashes, because backslashes have to be escaped in a Python string literal. So it will look like this:
'\\u0061\\u0062\\u0063'
The reason is that if you simply put unescaped backslashes in your string literal, like this:
a = "\u0061\u0062\u0063"
Python interprets the escape sequences, so when you display a at the prompt you will get:
>>> a
'abc'
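To make the difference between the printed value and its repr concrete, here is a small sketch. The variable names are only for illustration, and it uses ord() on each character so that it also works for non-ASCII input.

```python
# Built at runtime, so the result holds real backslash characters.
escaped = ''.join('\\u{:04x}'.format(ord(c)) for c in 'abc')

print(escaped)        # str() form: one backslash per escape
print(repr(escaped))  # repr() form: each backslash is shown doubled
print(len(escaped))   # six characters (backslash, 'u', four hex digits) per input character

# A literal with unescaped \u sequences is interpreted by Python instead.
a = "\u0061\u0062\u0063"
print(a == 'abc')
```

The doubled backslashes exist only in the repr; the string itself contains one backslash per escape.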
CodePudding user response:
'\u0061\u0062\u0063'.encode('utf-8')
will encode the text to UTF-8 bytes (b'abc' here), which is not the escaped form you asked for.
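To see what encode() actually returns, here is a short sketch comparing the 'utf-8' codec with the built-in 'unicode_escape' codec. Note that 'unicode_escape' leaves printable ASCII such as 'abc' untouched, so on its own it does not give the all-escaped form from the question. The sample text is an assumption chosen to include the invisible U+200D.

```python
# Sample text: three ASCII letters followed by the invisible U+200D.
text = 'abc' + chr(0x200D)

# UTF-8 encoding yields raw bytes, not escape sequences.
utf8_bytes = text.encode('utf-8')
print(utf8_bytes)

# unicode_escape escapes the non-ASCII character but keeps 'abc' as-is.
escaped_bytes = text.encode('unicode_escape')
print(escaped_bytes.decode('ascii'))
```

This is handy for spotting invisible characters, but escaping every character still needs a loop over ord().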
Edit:
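Since Python interprets the escape sequences in a string literal, you can't see the escaped form directly, but you can write a function that generates it.
def get_string_unicode(string_to_convert):
    res = ''
    for letter in string_to_convert:
        # ord() gives the code point; hex() returns e.g. '0x61', so drop the
        # '0x' prefix and left-pad with zeros to four digits.
        res += '\\u' + hex(ord(letter))[2:].zfill(4)
    return res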
Result:
>>> get_string_unicode('abc')
'\\u0061\\u0062\\u0063'
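One caveat: four hex digits only cover code points up to U+FFFF. A possible variant, sketched below under the assumption that you want valid Python escapes for every character, switches to the eight-digit \U form for higher code points. The function name to_unicode_escapes is hypothetical, and the round trip through the 'unicode_escape' codec is just a sanity check.

```python
def to_unicode_escapes(text):
    """Return text with every character written as a Python Unicode escape."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0xFFFF:
            # Four hex digits are enough for the Basic Multilingual Plane.
            parts.append('\\u{:04x}'.format(cp))
        else:
            # Higher code points need the eight-digit \U form.
            parts.append('\\U{:08x}'.format(cp))
    return ''.join(parts)


zwj = chr(0x200D)  # the invisible zero-width joiner
print(to_unicode_escapes(zwj))
print(to_unicode_escapes('abc'))
print(to_unicode_escapes(chr(0x1F600)))

# Sanity check: decoding the escapes should give back the original text.
original = 'a' + zwj + chr(0x1F600)
restored = to_unicode_escapes(original).encode('ascii').decode('unicode_escape')
print(restored == original)
```

Use {:04X} and {:08X} instead if you prefer uppercase hex digits, as in '\u200D'.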