Python3 interpret user input string as raw bytes (e.g. \x41 == "A")

Time:05-01

  • I want to accept user input from the command line using the input() function, and I expect the user to type something like \x41\x42\x43 to enter "ABC". The user MUST enter the input in this byte-escape format; they cannot provide the alphanumeric equivalent.

  • My issue is that when I take the user input and print it as bytes, Python appears to escape each backslash with another backslash, so the input is not interpreted as the bytes it represents in ASCII.

Example Code from Python3 Command Prompt:

1 | >>> var_abc = "\x41\x42\x43"
2 | >>> print(var_abc)
3 | ABC
4 | >>> print(bytes(var_abc, encoding='ascii'))
5 | b'ABC'

6 | >>> user_input_abc = input('enter user input in bytes: ')
7 | enter user input in bytes: \x41\x42\x43
8 | >>> print(user_input_abc)
9 | \x41\x42\x43
10| >>> print(bytes(user_input_abc, encoding='ascii'))
11| b'\\x41\\x42\\x43'
  • I want the output on Line 11 to be the same as the output on Line 5. What do I need to do to make Python interpret my user input as raw bytes instead of escaping each backslash?

CodePudding user response:

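To interpret a user input string as raw bytes, encode the string, decode those bytes using the "unicode_escape" encoding so that each \xNN sequence is resolved, and then encode the resulting string again to get the bytes object. Note that input() does not add any backslashes: the string holds literal backslash characters, and the bytes repr on Line 11 simply displays each one as \\.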

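user_input_abc = '\\x41\\x42\\x43'  # same string input() returns when the user types \x41\x42\x43
print(user_input_abc)  # \x41\x42\x43
# Resolve the \xNN escape sequences into the characters they name
user_input_escaped = user_input_abc.encode().decode('unicode_escape')
print(user_input_escaped)  # ABC
# latin-1 maps code points 0-255 to single bytes, so values such as \xff stay one byte
# (the default UTF-8 encoding would turn \xff into two bytes)
user_input_bytes = user_input_escaped.encode('latin-1')
print(user_input_bytes)  # b'ABC'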
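
The steps above can be wrapped in a small helper. This is only a sketch under a few assumptions: the function name parse_escaped_bytes is made up for illustration, the input is assumed to contain only ASCII characters, and every escape is assumed to name a value between \x00 and \xff.

```python
def parse_escaped_bytes(text: str) -> bytes:
    """Turn text such as r'\x41\x42' into the bytes it spells out (hypothetical helper)."""
    # Assumption: the user typed ASCII only; anything else raises UnicodeEncodeError here.
    raw = text.encode("ascii")
    # Resolve backslash escapes (\xNN, \n, ...) into the characters they name.
    unescaped = raw.decode("unicode_escape")
    # latin-1 maps code points 0-255 one-to-one onto byte values 0-255,
    # so each \xNN escape becomes exactly one byte. An escape that names a
    # code point above 255 (for example \u0141) raises UnicodeEncodeError.
    return unescaped.encode("latin-1")


# The string input() would return if the user typed \x41\x42\x43
print(parse_escaped_bytes("\\x41\\x42\\x43"))  # b'ABC'
# A value above \x7f still comes out as a single byte
print(parse_escaped_bytes("\\xff\\x00"))  # b'\xff\x00'
```

In an interactive script this would be called as parse_escaped_bytes(input('enter user input in bytes: ')). Using latin-1 for the final step matters only for values above \x7f; for plain ASCII escapes such as \x41 the default UTF-8 encoding gives the same result.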