RSA Encrypted data convert from bytes to string and back to bytes?

Time:04-03

I am trying to implement a symmetric-key agreement scheme using public-key cryptography between multiple clients via socket communication, and I have been testing the encryption and decryption functionality.

import rsa

def generateAKeys():
    (publicKey, privateKey) = rsa.newkeys(1024)
    with open('keys/APubKey.pem', 'wb') as p:
        p.write(publicKey.save_pkcs1('PEM'))
    with open('keys/APrivKey.pem', 'wb') as p:
        p.write(privateKey.save_pkcs1('PEM'))

def generateBKeys():
    (publicKey, privateKey) = rsa.newkeys(1024)
    with open('keys/BPubKey.pem', 'wb') as p:
        p.write(publicKey.save_pkcs1('PEM'))
    with open('keys/BPrivKey.pem', 'wb') as p:
        p.write(privateKey.save_pkcs1('PEM'))

def loadKeys():
    with open('keys/APubKey.pem', 'rb') as p:
        APubKey = rsa.PublicKey.load_pkcs1(p.read())
    with open('keys/APrivKey.pem', 'rb') as p:
        APrivKey = rsa.PrivateKey.load_pkcs1(p.read())
    with open('keys/BPubKey.pem', 'rb') as p:
        BPubKey = rsa.PublicKey.load_pkcs1(p.read())
    with open('keys/BPrivKey.pem', 'rb') as p:
        BPrivKey = rsa.PrivateKey.load_pkcs1(p.read())
    return APubKey, APrivKey, BPubKey, BPrivKey

def encrypt(message, key):
    return rsa.encrypt(message.encode('utf8'), key)

def decrypt(ciphertext, key):
    try:
        return rsa.decrypt(ciphertext, key).decode('utf8')
    except:
        return False

def sign(message, key):
    return rsa.sign(message.encode('utf8'), key, 'SHA-1')

def verify(message, signature, key):
    try:
        return rsa.verify(message.encode('utf8'), signature, key) == 'SHA-1'
    except:
        return False


APubKey, APrivKey, BPubKey, BPrivKey = loadKeys()

message = 'Hello'
ciphertext = encrypt(message, BPubKey)
print(ciphertext)
print(type(ciphertext))

signature = sign(message, APrivKey)

message1 = "{}|{}".format(ciphertext, signature)

# Simulating message transfer over a socket

message2 = message1  # Received message

message2 = message2.split("|")

ciphertext1, signature1 = message2[0], message2[1]

print(ciphertext1)
print(type(ciphertext1))

print(ciphertext == ciphertext1)
plaintext = decrypt(ciphertext1, BPrivKey)

if plaintext:
    print(f'Message text: {plaintext}')
else:
    print('Unable to decrypt the message.')

if verify(plaintext, signature1, APubKey):
    print('Successfully verified signature')
else:
    print('The message signature could not be verified')

The information sent between clients is packaged in the format message1 = "{}|{}".format(ciphertext, signature), which allows the message to be split up and read in individual pieces. In this file, I am just packing and unpacking the message in this format to simulate transmission between clients.

My issue is that once the encrypted message (which is bytes) is packaged into the format shown above, it is converted to a string. When I extract that encrypted message and try to decrypt it, decryption fails because it is no longer a bytes object. I have tried converting the string back into bytes, but to no avail. Any help would be appreciated.

Below is a copy of the output, in which the encrypted bytes are converted to a string that looks exactly the same:

b'\x8d\x19b\xbbE\xc1\xbf/K\x8b_}\xae\x0c\xb3\x8b\x94\x19\xfb\x8e\x01q6\xf5\xdd2O\\\xd2\xbf\xe3\xca\xcf\xac\x03\x84\xe9\xd7\xce\x13\xaaB\x16x\x13\xb4x26\xfc\x1c\xfe6\x82\xf6\x89i\x8aT\x87\xa0\xe9\x85p\xea\x03\x0fK\xb1/\xe0\x1b\x10a\x83\xa2\x0b}b\x0b\xc3\xe1"\xc1\x94\xfa\x95\xb0iQ\xa8%sqs\xc9\x98`gd,\xdc;\xa3\x08\xb6\xc3T:2N\xede-\x16\xe6i\xdc?3\x1d\x8c\x12^\x10\xde*\xc5'
<class 'bytes'>
b'\x8d\x19b\xbbE\xc1\xbf/K\x8b_}\xae\x0c\xb3\x8b\x94\x19\xfb\x8e\x01q6\xf5\xdd2O\\\xd2\xbf\xe3\xca\xcf\xac\x03\x84\xe9\xd7\xce\x13\xaaB\x16x\x13\xb4x26\xfc\x1c\xfe6\x82\xf6\x89i\x8aT\x87\xa0\xe9\x85p\xea\x03\x0fK\xb1/\xe0\x1b\x10a\x83\xa2\x0b}b\x0b\xc3\xe1"\xc1\x94\xfa\x95\xb0iQ\xa8%sqs\xc9\x98`gd,\xdc;\xa3\x08\xb6\xc3T:2N\xede-\x16\xe6i\xdc?3\x1d\x8c\x12^\x10\xde*\xc5'
<class 'str'>
False
Unable to decrypt the message.
The message signature could not be verified

CodePudding user response:

I found the cause and modified your code so that it works (you still have compile errors, BTW), but I'm not going to post it here, as it looks ugly (code-style-wise: check [Python.PEPs]: PEP 8 - Style Guide for Python Code) and I don't feel like formatting the whole thing.

The problem is that message1 = "{}|{}".format(ciphertext, signature) converts the bytes objects to their printed representation (the b'...' text), so when the message is split back (message2 = message2.split("|")) the results are str objects that differ from ciphertext and signature.
Note that these things are easy to spot by simply printing the variables (print(message2, ciphertext, signature)).
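
To see the effect in isolation, here is a minimal sketch that uses a short made-up bytes value in place of real RSA output (the name blob is only for illustration). Formatting a bytes object into a str inserts its printed representation, so encoding that text again does not give back the original bytes:

```python
blob = b"\x8d\x19b"  # hypothetical stand-in for the ciphertext

as_text = "{}".format(blob)   # str() of a bytes object is its repr
print(as_text)                # the literal characters b'\x8d\x19b'
print(type(as_text))          # <class 'str'>

# Encoding the text yields the bytes of the *representation*,
# not the three original bytes.
round_trip = as_text.encode("utf8")
print(round_trip == blob)     # False
print(len(blob), len(round_trip))
```

The original value has 3 bytes, while the round-tripped one is much longer, because every escape sequence such as \x8d has become four separate characters.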

It seems to me that it wouldn't hurt you to get more familiar with [Python.Docs]: Built-in Types - Bytes Objects.

Possible fix:

  • When sending:

    SEPARATOR = b"|"  # Notice the `b` prefix (byte string literal)
    message1 = ciphertext + SEPARATOR + signature
    
  • When receiving:

    message2 = message2.split(SEPARATOR)
    

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q071720539]> "e:\Work\Dev\VEnvs\py_pc064_03.09_test0\Scripts\python.exe" orig.py
b"~\xd2\xdc\xdf\xde\x84U\xe0\x88\x82\x1e\xa4\x17Q`Wu\x03\x04)\xac\x88\xe9P\xa5E\x05I\x1d_\xa7\x03%\x0ckYZ\xf5\xc6\xcd=,nTF\xf4\x14\xae\x96E-\xa3I\x06\x06\x84_W\xd3f\xefQ3_;$\x03\x9e\xc1\x94P\xab\xf8\xa0\xaf#'\xdf\xaf\n< \xec\x14\x9b9E\xce\x9e 9\xaeH\xf4)R\xed\xdaD\xc5\x9e^j\xd7L>W\x07i\x91^T\xd6u\xf4E\x9dk\xe5VV\x04_\xa2s\xad\xae\xad"
<class 'bytes'>
True
b"~\xd2\xdc\xdf\xde\x84U\xe0\x88\x82\x1e\xa4\x17Q`Wu\x03\x04)\xac\x88\xe9P\xa5E\x05I\x1d_\xa7\x03%\x0ckYZ\xf5\xc6\xcd=,nTF\xf4\x14\xae\x96E-\xa3I\x06\x06\x84_W\xd3f\xefQ3_;$\x03\x9e\xc1\x94P\xab\xf8\xa0\xaf#'\xdf\xaf\n< \xec\x14\x9b9E\xce\x9e 9\xaeH\xf4)R\xed\xdaD\xc5\x9e^j\xd7L>W\x07i\x91^T\xd6u\xf4E\x9dk\xe5VV\x04_\xa2s\xad\xae\xad"
<class 'bytes'>
True
Message text: Hello
Successfully verified signature

Needless to say, SEPARATOR needs to be the same on both ends.

Note: The pipe character (|, vertical bar) is too simple a value for a separator. If one of the byte strings contains it (and there's no guarantee that it won't, since ciphertext and signature are arbitrary bytes), you'll have a big surprise on the receiving end when splitting the message. Therefore, in order to drastically reduce the chance of a collision, you should set it to something more complex:

SEPARATOR = b"~!@#$%^&*()_+-=AbCd1234"
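
Even a long separator can, in principle, still occur inside the data. An alternative that avoids separators altogether is to prefix a field with its length. The following is a minimal sketch of that idea, not part of the fix above: pack_message and unpack_message are hypothetical helper names, and the 4-byte big-endian length header is an assumed convention that both ends would have to share.

```python
import struct

HEADER = struct.Struct(">I")  # 4-byte unsigned big-endian length (assumed convention)


def pack_message(ciphertext, signature):
    """Return length(ciphertext) + ciphertext + signature as one bytes object."""
    return HEADER.pack(len(ciphertext)) + ciphertext + signature


def unpack_message(data):
    """Split a packed message back into (ciphertext, signature)."""
    if len(data) < HEADER.size:
        raise ValueError("message too short to contain a length header")
    (length,) = HEADER.unpack(data[:HEADER.size])
    start = HEADER.size
    if len(data) < start + length:
        raise ValueError("message shorter than the declared ciphertext length")
    return data[start:start + length], data[start + length:]


# Made-up fields that deliberately contain the pipe character.
packed = pack_message(b"\x8d|\x19b", b"|\x0c\xb3|")
print(unpack_message(packed) == (b"\x8d|\x19b", b"|\x0c\xb3|"))
```

Because the receiver slices by length instead of searching for a marker, the content of the fields no longer matters.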

CodePudding user response:

This is normal: after encryption the data will certainly contain non-printable bytes, possibly even more of them than printable ones.

To make the process more symmetric, you should use byte strings instead of strings for the plaintext as well as the ciphertext; in Python these are indicated by a leading b, as in b"hello". (Available cryptography modules also require byte strings for everything, and this is the only way to encrypt files such as a picture.) Note that there is no difference between bytes and byte strings; it's just a different notation, so you don't have to look up the encoding for each normal character. If you already have bytes, no conversion is required.

Especially in Python 3, with its intrinsic use of Unicode, a look at the codecs module may be helpful.
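
If the transport really does need text, one common approach (offered here as an assumption, not something proposed in the answers above) is to Base64-encode the bytes before formatting them into a string and to decode them on the receiving side. A minimal sketch with the standard base64 module; the sample values are made up:

```python
import base64

ciphertext = b"\x8d\x19b|\xbbE"  # made-up bytes, including a pipe
signature = b"\x01|\xff\x00"     # made-up bytes

# Standard Base64 text uses only A-Z, a-z, 0-9, '+', '/' and '=',
# so '|' cannot appear inside an encoded field.
message1 = "{}|{}".format(
    base64.b64encode(ciphertext).decode("ascii"),
    base64.b64encode(signature).decode("ascii"),
)

# Receiving side: split the text, then decode each part back to bytes.
part1, part2 = message1.split("|")
ciphertext1 = base64.b64decode(part1)
signature1 = base64.b64decode(part2)

print(ciphertext1 == ciphertext, signature1 == signature)
```

The cost is size: Base64 text is roughly a third larger than the raw bytes.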
