how to generate a binary file

Time:01-02

I am working on a school project and this is my question:

Generate a binary file that contains the table of encoding and the data of the file using the Huffman encoding.

First I need to read the data from a file and create a Huffman tree. I created it and it is all working, but I am not able to generate the binary file: what I have are nodes rather than raw bytes, so I cannot write them to the binary file, and I am getting this error:

TypeError: a bytes-like object is required, not 'node'

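q = {}
# Each line of george.txt is assumed to hold a symbol and its frequency
with open("george.txt", 'r') as a_file:
    for line in a_file:
        key, value = line.split()
        q[key] = int(value)  # store as int so frequencies sort and add numerically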


class node:
    def __init__(self, freq, symbol, left=None, right=None):
        self.freq = freq

        self.symbol = symbol

        self.left = left

        self.right = right

        self.huff = ''


def printNodes(node, val=''):
    newVal = val + str(node.huff)
    if(node.left):
        printNodes(node.left, newVal)
    if(node.right):
        printNodes(node.right, newVal)

    if(not node.left and not node.right):
        print(f"{node.symbol} -> {newVal}")


chars = ['a', 'b', 'c', 'd', 'e', 'f']

# frequency of characters
freq = [q['a'], q['b'], q['c'], q['d'], q['e'], q['f']]

nodes = []

for x in range(len(chars)):
    nodes.append(node(freq[x], chars[x]))

while len(nodes) > 1:
    nodes = sorted(nodes, key=lambda x: x.freq)

    left = nodes[0]
    right = nodes[1]
    left.huff = 0
    right.huff = 1
    newNode = node(left.freq + right.freq, left.symbol + right.symbol, left, right)
    nodes.remove(left)
    nodes.remove(right)
    nodes.append(newNode)

printNodes(nodes[0])
with open('binary.bin', 'wb') as f:
    f.write(nodes[0])

CodePudding user response:

The error occurs because a file opened in 'wb' mode only accepts bytes-like objects, and a node is not one. The process of converting structured objects to a binary form is called "serialization", so searching for "python serialization" is a good place to start. Serialization is available in most programming languages and comes in many forms. The de facto serialization method in Python is pickle, provided by the standard-library module pickle.

Pickle lets you convert objects to a binary representation and vice versa, handling lots of little protocol details for you.

In your example you have:

with open('binary.bin', 'wb') as f:
    f.write(nodes[0])

You can serialize that to binary form like this:

import pickle

with open('binary.bin', 'wb') as f:
    b = pickle.dumps(nodes[0])  # bytes representation of your object
    f.write(b)                  # you can now write the bytes

You can also use pickle.dump, which writes directly to an open file object, to save the whole nodes list:

with open('binary.bin', 'wb') as f:
    pickle.dump(nodes, f)

Deserialization looks similar:

with open('binary.bin', 'rb') as f:
    b = f.read()
    node0 = pickle.loads(b)

or, reading directly from an open file object:

with open('binary.bin', 'rb') as f:
    nodes = pickle.load(f)
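If the assignment needs the file to hold the encoding table and the encoded data rather than the tree object itself, something along these lines might work. This is only a sketch under assumptions: Node is a simplified stand-in for the question's node class, and build_tree, build_codes, pack_bits, save and load are hypothetical helper names, as is the dictionary layout ('codes', 'nbits', 'data') written to the file.

```python
import pickle


class Node:
    # Simplified stand-in for the question's `node` class (hypothetical).
    def __init__(self, freq, symbol, left=None, right=None):
        self.freq = freq
        self.symbol = symbol
        self.left = left
        self.right = right


def build_tree(freqs):
    """Build a Huffman tree from a {symbol: frequency} dict."""
    nodes = [Node(f, s) for s, f in freqs.items()]
    while len(nodes) > 1:
        nodes.sort(key=lambda n: n.freq)
        left, right = nodes[0], nodes[1]
        merged = Node(left.freq + right.freq, left.symbol + right.symbol, left, right)
        nodes = nodes[2:] + [merged]
    return nodes[0]


def build_codes(root, prefix=''):
    """Return {symbol: bit string} for every leaf below root."""
    if root.left is None and root.right is None:
        return {root.symbol: prefix or '0'}  # a lone symbol still gets one bit
    codes = {}
    codes.update(build_codes(root.left, prefix + '0'))
    codes.update(build_codes(root.right, prefix + '1'))
    return codes


def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, zero-padding the last byte."""
    padded = bits + '0' * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def save(path, text, codes):
    """Write the code table, the true bit count, and the packed data to path."""
    bits = ''.join(codes[ch] for ch in text)
    with open(path, 'wb') as f:
        pickle.dump({'codes': codes, 'nbits': len(bits), 'data': pack_bits(bits)}, f)


def load(path):
    """Read the file back and decode it to the original text."""
    with open(path, 'rb') as f:
        blob = pickle.load(f)
    # Expand bytes to bits and drop the padding added by pack_bits.
    bits = ''.join(f'{byte:08b}' for byte in blob['data'])[:blob['nbits']]
    decode = {code: sym for sym, code in blob['codes'].items()}
    out, current = [], ''
    for bit in bits:
        current += bit
        if current in decode:  # Huffman codes are prefix-free, so the first match is right
            out.append(decode[current])
            current = ''
    return ''.join(out)
```

With your q dictionary of frequencies, usage would look like codes = build_codes(build_tree(q)), then save('binary.bin', text, codes), and later load('binary.bin') to get the text back. Storing the bit count alongside the data is one way to tell real bits from the padding in the last byte.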
