Using tempfile to insert barcodes into PDF file

Time:11-30

I'm working on a project that uses one large PDF file containing hundreds of thousands of images, and I need to insert a custom, variable barcode on every nth page (depending on a condition).

The contents of the barcode will change for every insertion; for this example, let's say they are based on the iteration count.

I've used PyMuPDF to manipulate PDFs in the past, including inserting images. I've tested inserting barcodes when they're saved to file, and have no issues.

I've used Treepoem in the past to generate custom barcodes as required, on a much smaller scale.

(This is still in the planning/proof-of-concept phase.) My concern is that at a larger scale I'll be limited by disk read/write speeds.

I understand that Python has a tempfile library, which I've never used. I'm attempting to use it to generate barcodes and save them to temporary files in memory, then insert them into the PDF file from memory rather than from disk.

I've tested and confirmed that generating a barcode and saving it to a file lets me insert it into the PDF file as required. Example below:

import fitz
import treepoem

treepoem.generate_barcode(
    barcode_type='datamatrixrectangular',
    data='10000010'
).convert('1').save('barcode_file.jpg')  # convert('1') forces monochrome, reducing file size.

pdf_file = fitz.open()  # Creating a new file for this example.
pdf_file.new_page()  # Inserting a new blank page.
page = pdf_file[0]

rect = fitz.Rect(70, 155, 200, 230)  # Generic area defined, required to insert barcode into. (x0, y0, x1, y1)

page.insert_image(rect, filename='barcode_file.jpg')
pdf_file.save('example_pdf_with_barcode.pdf')

When trying to use tempfile to avoid saving to disk, I'm not sure where it fits in.

I've tried creating a new tempfile object and writing the barcode image into it.

import fitz
import tempfile
import treepoem

barcode_contents = treepoem.generate_barcode(
    barcode_type='datamatrixrectangular',
    data='10000010'
).convert('1')

barcode_tempfile = tempfile.TemporaryFile()
barcode_tempfile.write(b'{barcode_contents}')  # Like f-string, with binary?
barcode_tempfile.seek(0)  # Required, not understood.

pdf_file = fitz.open()  # Creating a new file for this example.
pdf_file.new_page()  # Inserting a new blank page.
page = pdf_file[0]

rect = fitz.Rect(70, 155, 200, 230)  # Generic area defined, required to insert barcode into. (x0, y0, x1, y1)

page.insert_image(rect, filename=barcode_tempfile)
pdf_file.save('example_pdf_with_barcode.pdf')

This raises a permission-based error:

  File "<redacted>\example.py", line 20, in <module>
    page.insert_image(rect, filename=barcode_tempfile)
  File "<redacted>\venv\Lib\site-packages\fitz\utils.py", line 352, in insert_image
    xref, digests = page._insert_image(
                    ^^^^^^^^^^^^^^^^^^^
  File "<redacted>\venv\Lib\site-packages\fitz\fitz.py", line 6520, in _insert_image
    return _fitz.Page__insert_image(self, filename, pixmap, stream, imask, clip, overlay, rotate, keep_proportion, oc, width, height, xref, alpha, _imgname, digests)
           
RuntimeError: cannot open <redacted>\AppData\Local\Temp\tmpr_98wni9: Permission denied

I've looked for that temp file in the specified directory, but it can't be found, so I can't figure out how to troubleshoot this.

Treepoem's barcode generator also has a save() method, where you can typically save to file. I've tried to save to a tempfile instead, as below:

import fitz
import tempfile
import treepoem

treepoem.generate_barcode(
    barcode_type='datamatrixrectangular',
    data='10000010'
).convert('1').save(tempfile.TemporaryFile('barcode_tempfile'))

pdf_file = fitz.open()  # Creating a new file for this example.
pdf_file.new_page()  # Inserting a new blank page.
page = pdf_file[0]

rect = fitz.Rect(70, 155, 200, 230)  # Generic area defined, required to insert barcode into. (x0, y0, x1, y1)

page.insert_image(rect, filename=barcode_tempfile)
pdf_file.save('example_pdf_with_barcode.pdf')

This results in the error below:

  File "<redacted>\example.py", line 8, in <module>
    ).convert('1').save(tempfile.TemporaryFile('barcode_tempfile'))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<redacted>\AppData\Local\Programs\Python\Python311\Lib\tempfile.py", line 563, in NamedTemporaryFile
    file = _io.open(dir, mode, buffering=buffering,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid mode: 'barcode_tempfile'

So I'm unsure if I can save to a tempfile via this method.

Would anyone be able to explain whether this is possible and, if so, how best to tackle it?

(Currently using Python 3.11)

Thanks,

Answer:

Your problems are in the area of tempfile handling. Instead of going into detail, I suggest sticking with Pillow and using its facilities exclusively:

  • convert the PIL image to a JPEG / PNG as you did, but let PIL save to a memory file
  • insert that memory image using PyMuPDF

import io  # needed for in-memory output
import treepoem

fp = io.BytesIO()  # in-memory binary file
treepoem.generate_barcode(
    barcode_type='datamatrixrectangular',
    data='10000010'
).convert('1').save(fp, "JPEG")  # write the image to memory

# now insert image into page using PyMuPDF
# fp.getvalue() delivers the image content in memory
page.insert_image(rect, stream=fp.getvalue())
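
For anyone curious about the tempfile symptoms in the question, they can be reproduced with the standard library alone. This is only a minimal sketch: `payload` is a made-up stand-in for real image bytes, and the variable names are illustrative, not part of treepoem or PyMuPDF.

```python
import io
import tempfile

# Hypothetical placeholder bytes; a real barcode would come from Pillow's save().
payload = b"stand-in image bytes"

# 1. A bytes literal is never interpolated, so this is the literal text "{payload}".
literal = b'{payload}'

# 2. write() leaves the file position at the end, so read() returns nothing
#    until seek(0) rewinds to the start.
with tempfile.TemporaryFile() as tmp:
    tmp.write(payload)
    at_end = tmp.read()      # position is at the end of the file
    tmp.seek(0)              # rewind
    from_start = tmp.read()  # now the written bytes come back

# 3. The first positional parameter of TemporaryFile is the open mode,
#    not a file name, so a name-like string is rejected.
mode_error = None
try:
    tempfile.TemporaryFile('barcode_tempfile')
except ValueError as exc:
    mode_error = str(exc)

# 4. io.BytesIO keeps everything in memory; getvalue() returns the whole
#    buffer regardless of the current position.
buf = io.BytesIO()
buf.write(payload)
in_memory = buf.getvalue()
```

As for the "Permission denied" error: the traceback in the question shows that on Windows `TemporaryFile` goes through `NamedTemporaryFile`. As far as I know, such a file is held open by the creating handle and deleted on close, which would explain both why a second open by name fails and why the file can't be found afterwards. Passing the bytes through `stream=` avoids the file system entirely.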