I am trying to edit some print files. These consist of a very short (<100 bytes) binary header, a number of @PJL... lines (readable as plain text), and then binary data in one of a number of different languages.
While it would be possible to process the entire file as binary, it would be far easier to process the @PJL lines as plain text. This would require reading, as text, the part of the file between the first @PJL and the first \r that is not followed by @PJL.
e.g.
header
@PJL ... \r
@PJL ... \r
@PJL ... \r
b\01b\07...
Note: While the files can get quite large, the @PJL lines are always quite short (a couple of dozen lines at the most) so there is no issue with reading them into memory as a single block.
If you know how this may be achieved or can point me in the right direction, I would be very grateful.
Thanks,
CodePudding user response:
As you have mentioned in the comments, you want to be able to parse the file line by line, not all at once. You can do that if you iterate through the file one byte at a time, and build the lines yourself:
with open("file.b", "rb") as f:
    line = bytearray()
    while (b := f.read(1)):  # Read one byte at a time (the := operator needs Python 3.8+)
        if b == b'\r':  # End of line
            # Process the completed line here
            if line.startswith(b'@'):  # Line starting with @ (also safe for empty lines)
                text_line = line.decode()
            else:
                pass  # Process binary line
            line = bytearray()  # Reset line
        else:
            line += b  # Append byte to the current line
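If reading the @PJL section as a single block is acceptable (the question notes it is always short), the split described in the question can also be sketched with a regular expression. This is only a rough sketch under some assumptions: `split_pjl` and `PJL_BLOCK` are hypothetical names, the optional `\n` after `\r` is a guess in case some files use `\r\n`, and the header is assumed not to contain the bytes `@PJL` itself. Unlike the loop above, it finds the first `@PJL` with `find`, so the header does not need to end in `\r`.

```python
import re

# One or more consecutive "@PJL ...\r" lines. The optional "\n" is an
# assumption, in case some files terminate lines with "\r\n".
PJL_BLOCK = re.compile(rb"(?:@PJL[^\r]*\r\n?)+")


def split_pjl(data: bytes):
    """Hypothetical helper: split data into (header, pjl_lines, payload)."""
    start = data.find(b"@PJL")
    if start == -1:
        return data, [], b""  # No PJL section found; treat everything as header

    # Match the run of @PJL lines beginning exactly at the first "@PJL".
    match = PJL_BLOCK.match(data, start)
    end = match.end() if match else start

    # Pull out each line without its terminator and decode it as text.
    # Decoding as ASCII with replacement is an assumption about the PJL text.
    raw_lines = re.findall(rb"@PJL[^\r]*", data[start:end])
    pjl_lines = [raw.decode("ascii", errors="replace") for raw in raw_lines]

    return data[:start], pjl_lines, data[end:]


if __name__ == "__main__":
    sample = b"\x01\x02hdr@PJL SET A=1\r@PJL ENTER LANGUAGE=PCL\r\x1bE\x00\x07bin"
    header, pjl_lines, payload = split_pjl(sample)
    print(header, pjl_lines, payload)
```

Because the PJL section is small, it may be enough to pass in only the first chunk of a large file (for example the result of a single `f.read(65536)`; that size is an arbitrary assumption) and handle the rest of the binary payload separately.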