Python: bytearray object becomes bytes (immutable) when populated

Time:02-20

I'm trying to read the raw contents of a binary file, so they can be manipulated in memory. As far as I understand, bytes() objects are immutable, while bytearray() objects are mutable, so I read the file into a bytearray and then try to modify the latter:

raw_data = bytearray()

try:
    with open(input_file, "rb") as f:
        raw_data = f.read()
except IOError:
    print('Error opening', input_file)

raw_data[0] = 55   # attempt to modify the first byte

However, this last line raises TypeError: 'bytes' object does not support item assignment.
Wait... what 'bytes' object?

Let's look into the actual data types reported by Python, before and after the array is populated:

raw_data = bytearray()
print('Before:', type(raw_data))

try:
    with open(input_file, "rb") as f:
        raw_data = f.read()
except IOError:
    print('Error opening', input_file)

print('After: ', type(raw_data))

Output:

Before: <class 'bytearray'>
After:  <class 'bytes'>

So what's going on here? Why is the type modified, and can I prevent it?

I can always create another bytearray object from the contents of raw_data, but it'd be nice if I could save memory and just modify the original in place.

CodePudding user response:

Why is the type modified? Look at the following:

>>> x = 12
>>> type(x)
<class 'int'>
>>> x = 7.0
>>> type(x)
<class 'float'>

I assigned 12 to x, so x referred to an int. Then I assigned 7.0 to x, and x now refers to a float. A name in Python has no fixed type; assignment simply rebinds the name to whatever object is on the right-hand side. This is ordinary Python dynamic typing.

So it doesn't matter that you initially assigned a bytearray instance to raw_data. What counts is the last assignment to raw_data, which was:

raw_data = f.read()

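And f.read() on a file opened in binary mode returns a new bytes object. It does not fill the bytearray that raw_data previously referred to; that empty bytearray is simply discarded.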
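A minimal sketch of that rebinding, using an in-memory io.BytesIO as a stand-in for the opened file (an assumption made only so the snippet runs without a file on disk; the name `original` is also introduced just for illustration):

```python
import io

# io.BytesIO stands in for a file opened with mode "rb" (illustrative assumption).
f = io.BytesIO(b"\x01\x02\x03")

raw_data = bytearray()
original = raw_data      # a second name for the same bytearray object

raw_data = f.read()      # rebinds the name raw_data; the bytearray is not filled

rebound = raw_data is not original
print(type(original).__name__, len(original))   # the old bytearray is still empty
print(type(raw_data).__name__, len(raw_data))   # the name now refers to a bytes object
print(rebound)
```

The bytearray created first is untouched; only the name raw_data has moved to a different object.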

The way you get around this is by pre-allocating a bytearray with the correct size and using readinto:

with open(input_file, mode="rb") as f:
    # Seek to end of file and return offset from beginning:
    file_size = f.seek(0, 2)
    # Seek back to beginning:
    f.seek(0, 0)
    # Pre-allocate a bytearray of that size:
    raw_data = bytearray(file_size)
    # Fill the existing bytearray in place (no intermediate bytes object):
    f.readinto(raw_data)
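For something that can be run as-is, here is a hedged, self-contained sketch of the same approach. The helper name read_into_bytearray and the temporary demo file are hypothetical, added only for illustration. It also uses the count returned by readinto to trim the buffer, in case fewer bytes were read than were allocated.

```python
import os
import tempfile


def read_into_bytearray(path):
    """Hypothetical helper: return a file's contents as a mutable bytearray."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)   # offset of end of file, i.e. size in bytes
        f.seek(0)                       # back to the beginning
        buf = bytearray(size)           # pre-allocated, zero-filled buffer
        n = f.readinto(buf)             # fills buf in place, returns bytes read
    del buf[n:]                         # drop any unfilled tail (normally a no-op)
    return buf


# Demo on a throwaway file (illustrative data only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00\x01\x02\x03")
    demo_path = tmp.name

raw_data = read_into_bytearray(demo_path)
raw_data[0] = 55                        # item assignment now works
os.remove(demo_path)

print(type(raw_data).__name__, list(raw_data))
```

If the temporary extra copy is acceptable, raw_data = bytearray(f.read()) is the simpler alternative mentioned in the question; readinto avoids holding both the bytes object and the bytearray in memory at once.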