Python's built-in functions ord and chr work with Unicode characters, which are based on 16-bit numbers: ord returns the number for a character, and chr returns the character for a number. Is there a way to get extended ASCII characters (0-255), which are based on 8-bit unsigned numbers, without defining a dictionary to do so?
I could use Unicode characters, but their control characters are not the same as in ASCII (specifically 10, 13, 26, and 255). These four numbers are the main reason I can't use Unicode, because they are important to my code.
10: Line Feed
13: Carriage Return
26: Substitute
255: nbsp
CodePudding user response:
Python 3 has the bytes type, which has a decode method to convert each byte to the corresponding Unicode character. But since there are about a thousand different byte-oriented character sets, you'll have to tell decode which one you're using.
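As a rough sketch of that idea, here is a byte-oriented pair of helpers built on bytes and decode/encode. The names byte_chr and byte_ord are made up for illustration, and 'latin-1' is only a placeholder encoding, not a guess at the one you actually need.

```python
def byte_chr(n, encoding):
    """Decode one byte value (0-255) to a character in the given character set."""
    return bytes([n]).decode(encoding)  # bytes([n]) raises ValueError outside 0-255


def byte_ord(ch, encoding):
    """Encode one character back to its byte value in the given character set."""
    encoded = ch.encode(encoding)
    if len(encoded) != 1:
        raise ValueError(f"{ch!r} is not a single byte in {encoding}")
    return encoded[0]


# 'latin-1' is only a placeholder here; substitute your real character set.
letter = byte_chr(65, 'latin-1')
number = byte_ord('A', 'latin-1')
```

No dictionary is needed: the codec does the table lookup in both directions.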
The Python 3 default is 'utf-8', but the comment "255 is a control code" tells me that you're not using UTF-8. Neither I nor Python has a crystal ball, so you'll need to figure out the name of the character set you're using.
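To see why the default doesn't fit your data, here is a small check (the variable names are just for illustration): a lone byte 255 is not valid UTF-8, so decode with no argument rejects it.

```python
raw = bytes([255])

try:
    raw.decode()  # no argument, so the default 'utf-8' is used
    decoded_ok = True
except UnicodeDecodeError:
    # 0xFF can never appear in well-formed UTF-8
    decoded_ok = False

print(decoded_ok)
```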
[edit]
The "nbsp" probably means Non-Breaking Space, U+00A0. If that's encoded as 255, you're probably dealing with some flavor of DOS code page. .decode('cp850') may work, but as I said there are about a thousand byte-oriented character sets, and guessing the wrong name will give weird Unicode output.
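Assuming cp850 really is your code page (that part is still a guess), a quick way to check is to decode your four special bytes and look at the code points that come out. The last line shows what a wrong guess looks like, using 'latin-1' as the example.

```python
data = bytes([10, 13, 26, 255])

text = data.decode('cp850')
code_points = [f"U+{ord(c):04X}" for c in text]
print(code_points)  # ['U+000A', 'U+000D', 'U+001A', 'U+00A0']

# Encoding with the same code page gives the original bytes back.
round_trip = text.encode('cp850')

# A wrong guess still decodes, just to a different character:
wrong_guess = bytes([255]).decode('latin-1')  # 'ÿ' (U+00FF), not a non-breaking space
```

If 10, 13, and 26 come out as the same control characters and 255 comes out as U+00A0, the code page matches your four values; other bytes may still differ between DOS code pages.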