I have a string containing Unicode escape sequences that I need to decode. When I hardcode the string in Python, it works. However, if I read it through input(), the escapes are not interpreted. For example,
input_0 = input() #f\u00eate
print(input_0) # prints f\u00eate
word = "f\u00eate"
print(word) # prints fête
How could I turn the Unicode parts of the string from the input into regular characters? I have tried using str(word) too.
CodePudding user response:
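What you get from input() is an ordinary string in which the backslash is a literal character. Escape sequences are only interpreted in string literals in your source code, so the \u00ea you typed is 6 separate characters, not one.
You should encode it with "raw-unicode-escape" (which leaves the backslashes untouched) and then decode it with "unicode-escape" (which interprets them):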
input_0 = input() # f\u00eate
print(input_0.encode("raw-unicode-escape").decode("unicode-escape"))
Explanation for these two encodings: https://docs.python.org/3/library/codecs.html#text-encodings
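As a minimal sketch of the same idea, the two steps can be wrapped in a small helper. The name decode_escapes is hypothetical and only used for illustration; the string "f\\u00eate" stands in for what input() would return when the user types f\u00eate.

```python
def decode_escapes(text: str) -> str:
    """Interpret backslash escapes (e.g. \\u00ea) that appear literally in text."""
    # raw-unicode-escape keeps backslashes as they are, writes Latin-1
    # characters as single bytes and everything else as \uXXXX.
    raw = text.encode("raw-unicode-escape")
    # unicode-escape then interprets every escape sequence in those bytes.
    return raw.decode("unicode-escape")


typed = "f\\u00eate"           # same characters input() would give for f\u00eate
print(len(typed))              # 9: the backslash and each hex digit count separately
print(decode_escapes(typed))   # fête
```

Note that "unicode-escape" also interprets other escapes such as \n and \t, and a malformed sequence (for example a truncated \u00) raises UnicodeDecodeError, so you may want to wrap the call in try/except if the input is untrusted.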