I'm new to Python and I'm learning about encodings, Unicode, ASCII and so on. I would like to print ASCII characters according to their codes, using the chr() function.
def table_ascii():
    "Print a table of ASCII characters with their values."
    i = 127
    while i < 258:
        print(f"{i} -> {chr(i)}")
        i += 1

table_ascii()
Unfortunately, the result is wrong. It stops at code 157:
127 ->
128 ->
129 ->
130 ->
131 ->
132 ->
133 ->
134 ->
135 ->
136 ->
137 ->
138 ->
139 ->
140 ->
142 ->
143 ->
144 ->
146 ->
147 ->
148 ->
149 ->
150 ->
151 ->
152 ->
154 ->
155 ->
157 ->
I understand these codes return blank but why do they stop the process?
Setup:
- Python 3.8.10 (default, Sep 28 2021, 16:10:42) [GCC 9.3.0] on linux
- Using VIM - Vi IMproved 8.1
When I run this code in Visual Studio Code, the script produces output through 256. But in my console (Linux Mate), it blocks. That's difficult to understand for me...
CodePudding user response:
Firstly, ASCII only goes up to 127 (0x7F). chr() actually returns the Unicode character.
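As a minimal sketch of that distinction: chr() and ord() convert between integers and Unicode code points, while encoding to ASCII only succeeds up to 127. The helper name is_ascii and the sample values are made up for illustration.

```python
# chr() maps an integer to the Unicode character with that code point;
# ord() is the inverse.
assert chr(65) == "A"
assert ord("A") == 65

# Code points above 127 are valid Unicode but not ASCII.
e_acute = chr(233)  # U+00E9, LATIN SMALL LETTER E WITH ACUTE


def is_ascii(ch):
    """Return True if the character can be encoded as ASCII (hypothetical helper)."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(is_ascii("A"))      # True
print(is_ascii(e_acute))  # False
```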
I think the problem is that when U+009D (157), Operating System Command (OSC), is printed, your terminal starts a control string and waits for a string terminator such as U+009C String Terminator, U+001B Escape followed by U+005C backslash, or U+0007 BEL. Since none of those sequences is ever printed later, the terminal stops showing the output. For more info, see ANSI escape code § Fe Escape sequences and C1 control codes on Wikipedia.
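The mechanism can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how any real terminal is implemented: the function name visible_text and the rule that any of the three terminators ends any control string are made up for illustration.

```python
# Toy model of a terminal that swallows "control strings".
# Real terminals differ; this only illustrates the mechanism.
STRING_OPENERS = {0x90, 0x98, 0x9D, 0x9E, 0x9F}  # DCS, SOS, OSC, PM, APC
ST = 0x9C   # String Terminator
ESC = 0x1B  # Escape; ESC followed by a backslash is the 7-bit form of ST
BEL = 0x07  # simplification: treated here as a terminator for every string


def visible_text(text):
    """Return the part of text that this toy terminal would display."""
    shown = []
    in_string = False
    i = 0
    while i < len(text):
        cp = ord(text[i])
        if in_string:
            # Inside a control string: discard everything until a terminator.
            if cp in (ST, BEL):
                in_string = False
            elif cp == ESC and text[i + 1:i + 2] == "\\":
                in_string = False
                i += 1  # skip the backslash as well
        elif cp in STRING_OPENERS:
            in_string = True
        else:
            shown.append(text[i])
        i += 1
    return "".join(shown)


print(visible_text("before\x9dhidden"))            # prints: before
print(visible_text("before\x9dhidden\x9cafter"))   # prints: beforeafter
```

In the first call nothing after chr(157) is shown because no terminator ever arrives, which matches the behaviour described in the question.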
Unicode characters U+0080 (128) to U+009F (159) are control characters, meaning they're not generally printable, so you were never going to get sensible output in the first place.
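One possible workaround, sketched here with hypothetical helper names (describe, table_lines), is to print an escaped form for control characters instead of sending them raw to the terminal. It relies on unicodedata.category() returning "Cc" for control characters.

```python
import unicodedata


def describe(code):
    """Return a representation of the code point that is safe to print."""
    ch = chr(code)
    # Category "Cc" covers the C0 and C1 control characters and DEL.
    if unicodedata.category(ch) == "Cc":
        return repr(ch)  # escaped text such as '\x9d', not the raw control
    return ch


def table_lines(start, stop):
    """Build the table rows for code points in range(start, stop)."""
    return [f"{code} -> {describe(code)}" for code in range(start, stop)]


# Show a few rows around the code that stopped the original output.
for line in table_lines(155, 160):
    print(line)
```

Calling table_lines(127, 256) would produce the full table from the question without emitting any raw control characters.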
CodePudding user response:
As mentioned in the comments, the characters from 128 to 159 are something of a no-man's land. Unicode reserves them as C1 control codes but does not define what they do, and they may have special meaning for various displays. That's the reason why Unicode doesn't touch them: too many variable uses in play.
A terminal such as a Linux xterm accepts control codes to do things like display text in color. Looking at Xterm Control Sequences, we see:
Privacy Message (PM is 0x9e)
That's 158 decimal, and it's one of xterm's 8-bit control characters. This starts a "privacy message" that continues until a defined string terminator character is seen. xterm doesn't implement "privacy message", and it looks from your output that it simply ignores the remaining output as being part of that message.
This is a VT100 type thing. Some terminals may implement some actions. Others may have a character mapped to that octet. You won't find any consistent implementation.