How to convert this binary to find out what's behind these characters in Python

Time:07-15

An external system returns the following bytes object. How can I decode it to see what it actually contains?

b'–0800B€      œ       0521222071414292570529606200222081'


How do I convert EBCDIC ('cp1141') to text?

CodePudding user response:

The command prompt renders output using a character encoding (UTF-8 is the most common one), which is why you see these characters rather than the underlying bits. You can use this site to convert between UTF-8 text and binary: https://onlinebinarytools.com/convert-utf8-to-binary

Alternatively, you can use this code to convert:

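def utf8_to_binary(u):
    # Encode the text as UTF-8 and format each byte as 8 binary digits,
    # separated by single spaces.
    return ' '.join(f'{i:08b}' for i in u.encode('utf-8'))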

v = '–0800B€      œ       0521222071414292570529606200222081'

print(utf8_to_binary(v))
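
Because the value returned by the external system is already a bytes object, it can also be inspected directly without encoding a string first. The following is only a minimal sketch: the helper names bytes_to_binary and bytes_to_hex and the four sample bytes are assumptions for illustration, not the data from the question.

```python
def bytes_to_binary(data: bytes) -> str:
    # Format every byte as 8 binary digits, separated by spaces.
    return ' '.join(f'{b:08b}' for b in data)


def bytes_to_hex(data: bytes) -> str:
    # Hex is usually easier to compare against a code page table.
    # bytes.hex() accepts a separator from Python 3.8 onwards.
    return data.hex(' ')


# Hypothetical sample: four bytes in the range EBCDIC uses for digits.
sample = bytes([0xF0, 0xF8, 0xF0, 0xF0])

print(bytes_to_binary(sample))
print(bytes_to_hex(sample))
```

Looking at the hex values first makes it easier to judge which encoding the sender may have used before trying to decode anything.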

CodePudding user response:

You can convert it to UTF-8 or UTF-16:

txt = bytes('–0800B€      œ       0521222071414292570529606200222081', 'utf-8')

or

txt = bytes('–0800B€      œ       0521222071414292570529606200222081', 'utf-16')

Then you can output it.

EDIT:

You can also encode the string as UTF-8 and then decode it as ASCII, ignoring every byte that is not valid ASCII:

txt = '–0800B€      œ       0521222071414292570529606200222081'
txt = txt.encode('utf-8')
# Use a name that does not shadow the built-in ascii() function.
ascii_text = txt.decode('ascii', 'ignore')
print(ascii_text)

The output is:

0800B             0521222071414292570529606200222081
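
Note that decoding with 'ignore' discards every non-ASCII byte, so part of the message is lost. If the data really is EBCDIC, as the question suggests, something along these lines may be closer to what is needed. This is a sketch under assumptions: it requires the raw bytes exactly as received, and the helper decode_ebcdic, the candidate list, and the sample bytes are hypothetical. Whether a codec named 'cp1141' is registered depends on the Python installation, so the code checks with codecs.lookup and otherwise falls back to related EBCDIC code pages ('cp273', 'cp500', 'cp037'), which differ from each other in some special characters.

```python
import codecs


def decode_ebcdic(data: bytes, candidates=('cp1141', 'cp273', 'cp500', 'cp037')):
    """Decode data with the first candidate codec this Python provides.

    Returns a (codec_name, text) tuple so the caller can see which
    code page was actually used.
    """
    for name in candidates:
        try:
            codecs.lookup(name)  # raises LookupError if the codec is unknown
        except LookupError:
            continue
        return name, data.decode(name, errors='replace')
    raise LookupError('none of the candidate codecs is available')


# Hypothetical sample: the EBCDIC bytes for the digits "0800".
sample = bytes([0xF0, 0xF8, 0xF0, 0xF0])

codec_name, text = decode_ebcdic(sample)
print(codec_name, text)
```

Digits and basic letters sit at the same positions in these code pages, so the fallback is usually good enough for a first look; special characters such as the euro sign may come out differently.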