pyPDF2 PdfFileWriter output returns a corrupted file

Time:04-02

I am very new to Python. The following code takes user input from a GUI for the "x" and "a" variables. The goal is to open each .pdf in the directory, perform the modifications, and save the result over the original file. Each PDF in the directory is a single page. It seems to work; however, the newly saved file is corrupted and cannot be opened.

Seal_pdf = PdfFileReader(open(state, "rb"), strict=False)
input_pdf = glob.glob(os.path.join(x, '*.pdf'))
output_pdf = PdfFileWriter()
page_count = len(fnmatch.filter(os.listdir(x), '*.pdf'))
i = 0

if a == "11x17":
    for file in input_pdf:
        sg.OneLineProgressMeter('My Meter', i, page_count, 'And now we Wait.....')
        PageObj = PyPDF2.PdfFileReader(open(file, "rb"), strict=False).getPage(0)
        PageObj.scaleTo(11*72, 17*72)
        PageObj.mergePage(Seal_pdf.getPage(0))
        output_filename = f"{file}"
        f = open(output_filename, "wb+")
        output_pdf.write(f)
        i = i + 1

Adding output_pdf.addPage(PageObj) to the loop produces an uncorrupted file; however, each successive .pdf is then appended to the previous ones (e.g. "pdf1" contains only "pdf1", "pdf2" now has two pages, "pdf1" and "pdf2" merged, and so on). I also tried changing the two lines that open and write the output file to

with open(output_filename, "wb+") as f:
    output_pdf.write(f)

with no luck. I can't figure out what I am missing to make PdfFileWriter produce a single-page, uncorrupted file for each individual PDF in the directory.

if a == "11x17":
    for file in input_pdf:
        sg.OneLineProgressMeter('My Meter', i, page_count, 'And now we Wait.....')
        PageObj = PyPDF2.PdfFileReader(open(file, "rb"), strict=False).getPage(0)
        PageObj.scaleTo(11*72, 17*72)
        PageObj.mergePage(Seal_pdf.getPage(0))
        output_pdf.addPage(PageObj)
        output_filename = f"{file}"
        f = open(output_filename, "wb+")
        output_pdf.write(f)
        i = i + 1

CodePudding user response:

I finally solved this by moving output_pdf = PdfFileWriter() inside the loop, so each file gets a fresh writer that holds only its own page. I stumbled across that as the solution to another loop issue and thought I would try it.
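For anyone landing here, this is a minimal sketch of that fix, not the exact code I ended up with. It assumes a PyPDF2 release older than 3.0, since that is where the PdfFileReader/PdfFileWriter names from the question still exist. The function names stamp_one and stamp_directory and the seal_path parameter are made up for illustration. Reading each input into memory before overwriting it is an extra precaution added here, not part of the original fix.

```python
import glob
import io
import os


def stamp_one(path, seal_bytes, width, height):
    """Scale page 0 of `path`, merge the seal onto it, and save over `path`."""
    # Imported here so the module loads even where PyPDF2 is not installed.
    # Assumes PyPDF2 < 3.0, which still has the names used in the question.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    # Read the whole input into memory so the file on disk is closed
    # before it gets overwritten below.
    with open(path, "rb") as src:
        source = io.BytesIO(src.read())

    page = PdfFileReader(source, strict=False).getPage(0)
    page.scaleTo(width, height)
    seal_page = PdfFileReader(io.BytesIO(seal_bytes), strict=False).getPage(0)
    page.mergePage(seal_page)

    # The actual fix: a new writer per file, so it holds exactly one page.
    writer = PdfFileWriter()
    writer.addPage(page)
    with open(path, "wb") as out:
        writer.write(out)


def stamp_directory(directory, seal_path, width=11 * 72, height=17 * 72):
    """Stamp every .pdf in `directory` in place; return the paths processed."""
    with open(seal_path, "rb") as f:
        seal_bytes = f.read()

    done = []
    for path in sorted(glob.glob(os.path.join(directory, "*.pdf"))):
        # Skip the seal itself if it happens to live in the same directory.
        if os.path.abspath(path) == os.path.abspath(seal_path):
            continue
        stamp_one(path, seal_bytes, width, height)
        done.append(path)
    return done
```

With the variables from the question, stamp_directory(x, state) would cover the "11x17" branch; the progress meter is left out to keep the sketch short.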
