Appending encrypted data to file

I am using the cryptography library for Python. My goal is to take a string, encrypt it, and then write it to a file.

This may be done multiple times, with each write appending additional encrypted data to the end of the file.

I have tried a few solutions, such as:

Using the hazmat level API to avoid as much meta data stored in the encrypted text.
Writing each encrypted string to a new line in a text file.

This is the code that uses ECB mode and the hazmat API. It attempts to read the file and decrypt it line by line. I understand it is unsafe; my main use is to log this data locally to a file only, and then use a safe PKCS over the wire.

from cryptography import fernet

key = 'WqSAOfEoOdSP0c6i1CiyoOpTH2Gma3ff_G3BpDx52sE='
crypt_obj = fernet.Fernet(key)
file_handle = open('test.txt', 'a')

data = 'Hello1'
data = crypt_obj.encrypt(data.encode())
file_handle.write(data.decode() + '\n')
file_handle.close()


file_handle_two = open('test.txt', 'a')
data_two = 'Hello2'
data_two = crypt_obj.encrypt(data_two.encode())
file_handle_two.write(data_two.decode() + '\n')
file_handle_two.close()


file_read = open('test.txt', 'r')
file_lines = file_read.readlines()
file_content = ''
for line in file_lines:
    line = line[:-2]
    file_content = crypt_obj.decrypt(line.encode()).decode()
    print(file_content)
file_read.close()

For the code above I get the following error:

Traceback (most recent call last):
  File "C:\Dev\Python\local_crypt_test\venv\lib\site-packages\cryptography\fernet.py", line 110, in _get_unverified_token_data
    data = base64.urlsafe_b64decode(token)
  File "C:\Users\19097\AppData\Local\Programs\Python\Python39\lib\base64.py", line 133, in urlsafe_b64decode
    return b64decode(s)
  File "C:\Users\19097\AppData\Local\Programs\Python\Python39\lib\base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Dev\Python\local_crypt_test\main.py", line 25, in <module>
    file_content = crypt_obj.decrypt(line.encode()).decode()
  File "C:\Dev\Python\local_crypt_test\venv\lib\site-packages\cryptography\fernet.py", line 83, in decrypt
    timestamp, data = Fernet._get_unverified_token_data(token)
  File "C:\Dev\Python\local_crypt_test\venv\lib\site-packages\cryptography\fernet.py", line 112, in _get_unverified_token_data
    raise InvalidToken
cryptography.fernet.InvalidToken

Process finished with exit code 1

These examples are only meant to demonstrate the issue. My real code looks quite different, so you may ignore errors in the example that do not pertain to my main issue: appending encrypted data to a file, then reading and decrypting that data at a later time. The file does not need to be in any specific format, as long as it can be read and decrypted to obtain the original message. Also, I am not tied to ECB as the mode of operation; if your example uses another mode, that works too.

I am honestly stumped and would appreciate any help!

CodePudding user response：

from cryptography import fernet

key = 'WqSAOfEoOdSP0c6i1CiyoOpTH2Gma3ff_G3BpDx52sE='
crypt_obj = fernet.Fernet(key)
file_handle = open('test.txt', 'a')

data = 'Hello1'
data = crypt_obj.encrypt(data.encode('utf-8'))
file_handle.write(data.decode('utf-8') + '\n')
file_handle.close()


file_handle_two = open('test.txt', 'a')
data_two = 'Hello2'
data_two = crypt_obj.encrypt(data_two.encode('utf-8'))
file_handle_two.write(data_two.decode('utf-8') + '\n')
file_handle_two.close()


file_read = open('test.txt', 'r')
file_lines = file_read.readlines()
file_content = ''
for line in file_lines:
    # line = line[:-2]
    file_content = crypt_obj.decrypt(line.encode('utf-8')).decode()
    print(file_content)
file_read.close()

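By removing the last two characters from each line, you also remove a character that is needed for decoding. In text mode each line ends in a single '\n', so line[:-2] cuts off the newline and the final '=' padding character of the token, which is what triggers the "Incorrect padding" error.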

CodePudding user response：

There are a couple of details at play here.

1. Trailing newline character(s) are included in each line

When you loop through file_lines, each line includes the trailing newline character(s).

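I say "character(s)" because the line ending stored on disk varies by platform (e.g. Linux/macOS = '\n' versus Windows = '\r\n'). Note that when a file is read in text mode with the default newline setting, Python translates either form to a single '\n', so slicing off two characters removes one character of the token as well.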
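
As a minimal sketch of this point, the snippet below writes a base64 string to a temporary file and reads it back in text mode. The token value and the temporary file path are stand-ins chosen for illustration; they are not part of the original code, and no real Fernet token is involved.

```python
import os
import tempfile

# Stand-in for a base64 token (an assumption for illustration, not a Fernet token).
token = "aGVsbG8="

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "a") as f:
    f.write(token + "\n")

with open(path, "r") as f:
    line = f.readline()

# Text mode with default newline handling yields a single '\n' at the end
# of the line, whether the file holds '\n' or '\r\n' on disk.
print(repr(line))               # 'aGVsbG8=\n'
print(repr(line[:-2]))          # 'aGVsbG8'  (newline AND the '=' are gone)
print(repr(line.rstrip("\n")))  # 'aGVsbG8=' (only the newline is gone)
```

Slicing by a fixed count assumes a line ending length, whereas rstrip('\n') removes only what is actually there.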

2. base64 decoding silently discards invalid characters

Fernet.encrypt(data) returns a bytes instance containing a base64 encoded "Fernet token".

Conversely, the first step Fernet.decrypt(token) takes is decoding the token by calling base64.urlsafe_b64decode(). This function uses the default non-validating behavior, in which characters outside the base64 alphabet are discarded before the padding check (described in the base64 module documentation under the validate parameter of b64decode()).

Note: This is why the answer from TheTS happens to work despite leaving the extraneous newline character intact.
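
The behavior described above can be checked with the standard library alone. This is a small sketch that uses a plain base64 string rather than a real Fernet token; the variable names are illustrative only.

```python
import base64
import binascii

# Stand-in for a token: plain base64 of b"hello", which is b'aGVsbG8='.
token = base64.urlsafe_b64encode(b"hello")

# A trailing newline is outside the base64 alphabet, so the default
# (non-validating) decoder drops it silently and decoding succeeds.
with_newline = base64.urlsafe_b64decode(token + b"\n")
print(with_newline)  # b'hello'

# Removing one padding character leaves a length the decoder cannot
# accept, so decoding raises binascii.Error.
try:
    base64.urlsafe_b64decode(token[:-1])
    truncated_error = None
except binascii.Error as exc:
    truncated_error = str(exc)
print(truncated_error)
```

So a leftover newline is tolerated, while a missing padding character is not, which matches the two outcomes seen in this thread.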

Solution

I'd recommend making sure you provide Fernet.decrypt() the token exactly as produced by Fernet.encrypt(). I'm guessing this is what you were trying to do by stripping the last two characters.

Here's an approach that should be safe and not platform dependent.

When you call open() for writing, provide the newline='\n' argument to prevent the default behavior of converting each '\n' to the platform-dependent os.linesep value (in the open() documentation, see the second bullet point under the newline argument, which details how it applies when writing files).
When processing each line, use rstrip('\n') to remove the expected trailing newline.

Here's a code example that demonstrates this:

#!/usr/bin/python3

from cryptography import fernet

to_encrypt = ['Hello1', 'Hello2']
output_file = 'test.txt'

key = 'WqSAOfEoOdSP0c6i1CiyoOpTH2Gma3ff_G3BpDx52sE='
crypt = fernet.Fernet(key)

print("ENCRYPTING...")
for data in to_encrypt:
    data_bytes = data.encode('utf-8')
    token_bytes = crypt.encrypt(data_bytes)
    print(f'data: {data}')
    print(f'token_bytes: {token_bytes}\n')
    with open(output_file, 'a', newline='\n') as f:
        f.write(token_bytes.decode('utf-8') + '\n')

print("\nDECRYPTING...")
with open(output_file, 'r') as f:
    for line in f:
        # Create a copy of line which shows the trailing newline.
        line_escaped = line.encode('unicode_escape').decode('utf-8')
        line_stripped = line.rstrip('\n')
        token_bytes = line_stripped.encode('utf-8')
        data = crypt.decrypt(token_bytes).decode('utf-8')
        print(f'line_escaped: {line_escaped}')
        print(f'token_bytes: {token_bytes}')
        print(f'decrypted data: {data}\n')

Output:

Note the trailing newline when line_escaped is printed.

$ python3 solution.py
ENCRYPTING...
data: Hello1
token_bytes: b'gAAAAABi-LAo-h8w-ayc267hrLbswMZtkT4RQQ9wt0EusYNrZGjuzbpyRLoKDZZF4oQPOU-iH1PnCc7vSIOoTVMLlCFnHTkN6A=='

data: Hello2
token_bytes: b'gAAAAABi-LAoHUT8Iu1bVMcGSIrFRvtVZQFh4O52XYSCgd0leYWS-n38irhv3Ch7oEx6SXazHwAL7a57ncFoMJTQQAms52yf3w=='


DECRYPTING...
line_escaped: gAAAAABi-LAo-h8w-ayc267hrLbswMZtkT4RQQ9wt0EusYNrZGjuzbpyRLoKDZZF4oQPOU-iH1PnCc7vSIOoTVMLlCFnHTkN6A==\n
token_bytes: b'gAAAAABi-LAo-h8w-ayc267hrLbswMZtkT4RQQ9wt0EusYNrZGjuzbpyRLoKDZZF4oQPOU-iH1PnCc7vSIOoTVMLlCFnHTkN6A=='
decrypted data: Hello1

line_escaped: gAAAAABi-LAoHUT8Iu1bVMcGSIrFRvtVZQFh4O52XYSCgd0leYWS-n38irhv3Ch7oEx6SXazHwAL7a57ncFoMJTQQAms52yf3w==\n
token_bytes: b'gAAAAABi-LAoHUT8Iu1bVMcGSIrFRvtVZQFh4O52XYSCgd0leYWS-n38irhv3Ch7oEx6SXazHwAL7a57ncFoMJTQQAms52yf3w=='
decrypted data: Hello2
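
If you want to confirm the effect of newline='\n' separately from the encryption, one option is to read the file back in binary mode and inspect the raw bytes. The sketch below assumes a scratch file in a temporary directory and uses placeholder strings instead of real tokens.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this check.
path = os.path.join(tempfile.mkdtemp(), "newline_demo.txt")

# newline='\n' turns off translation of '\n' to os.linesep when writing.
with open(path, "a", newline="\n") as f:
    f.write("token-one\n")
    f.write("token-two\n")

# Binary mode shows exactly what was stored on disk.
with open(path, "rb") as f:
    raw = f.read()

print(raw)  # b'token-one\ntoken-two\n'
```

With this argument the file contains '\n' separators on every platform, so the on-disk format does not depend on where the log was written.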