How to copy everything from PDF page to PDF page?

Time:11-09

I have several PDFs whose pages are an odd size, somewhat smaller than 8.5x11. These pages need to be 8.5x11, with their contents centered and not resized. This could be achieved by printing the pages on 8.5x11 paper at actual size with orientation set to auto and then re-scanning the printed pages, but that would waste time and paper and might lose quality. After much googling and searching https://api.itextpdf.com/, I came up with this in C#, using iText 7.1.14:

public void FixSize() // method in PDFentry class
{
    // FQFName is the string property Fully Qualified File Name
    PdfDocument srceDoc = new PdfDocument(new PdfReader(FQFName));
    int pageCount = srceDoc.GetNumberOfPages();
    string tempFile = Path.GetTempFileName();
    PdfDocument destDoc = new PdfDocument(new PdfWriter(tempFile));
    iText.Kernel.Geom.PageSize newPageSize = iText.Kernel.Geom.PageSize.LETTER;
    // this is in a try/catch, removed here for brevity
    for (int page = 1; page <= pageCount; page++)
    {
        PdfPage srcPage = srceDoc.GetPage(page);
        PdfPage dstPage = destDoc.AddNewPage(newPageSize);
        PdfFormXObject pageXObject = srcPage.CopyAsFormXObject(destDoc);
        PdfCanvas pdfCanvas = new PdfCanvas(dstPage);
        // 18 is temporary to be replaced with a variable parameter
        pdfCanvas.AddXObjectAt(pageXObject, 18, 18);
        pdfCanvas.Release();
    }
    srceDoc.Close();
    destDoc.Close();
    File.Copy(tempFile, FQFName, true);
    File.Delete(tempFile);
}

It works as far as the basic content of the pages is concerned, but it loses bookmarks and probably comments as well (I don't have PDFs with comments). In fact, the result is exactly as if the pages had been printed and re-scanned, so I guess be careful what you ask for. What would I do to copy everything from each page, essentially duplicating the source PDF, just on larger pages?

For what it's worth, the PDFs will never include live forms, portfolios, PDFs with attachments, or PDFs with audio or video. They will always be PDFs scanned from paper or "printed" from applications.

Answer:

As @KJ already hinted at in comments, if one wants to change some detail of a document but leave everything else as is, one should not copy the PDF page by page and hope everything remains intact. Creating form XObjects from the original pages and adding them to new pages is even more lossy.

In iText you should instead process the PDF in stamping mode, i.e. create a PdfDocument based on both a PdfReader and PdfWriter, and merely apply the desired changes.

In the case at hand that can be done like this (iText 7.2.0):

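using iText.Kernel.Geom;
using iText.Kernel.Pdf;

// Target size in default PDF user units (1/72 inch): 8.5" x 11"
float resultWidth = 8.5f * 72;
float resultHeight = 11f * 72;

// Stamping mode: one PdfDocument with both a reader and a writer
using (PdfDocument pdfDocument = new PdfDocument(new PdfReader(INFILE), new PdfWriter(OUTFILE)))
{
    for (int i = 1; i <= pdfDocument.GetNumberOfPages(); i++)
    {
        var page = pdfDocument.GetPage(i);
        var cropBox = page.GetCropBox();
        // Grow the crop box by the same amount on opposite sides so the content stays centered
        var newCropBox = new Rectangle(cropBox.GetLeft() - (resultWidth - cropBox.GetWidth()) / 2,
            cropBox.GetBottom() - (resultHeight - cropBox.GetHeight()) / 2, resultWidth, resultHeight);
        var mediaBox = page.GetMediaBox();
        // The media box has to contain the crop box, so use the bounding box of both
        var newMediaBox = Rectangle.GetCommonRectangle(mediaBox, newCropBox);
        page.SetMediaBox(newMediaBox);
        page.SetCropBox(newCropBox);
    }
}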

(ResizePages test ResizeForRobertSF)

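Here, in stamping mode, we inspect the crop box and media box of each page of the original PDF, resize the crop box to 8.5"×11" around its original center, and make sure that the media box encompasses the updated crop box. The page content itself is not touched, so bookmarks, annotations, and everything else in the document remain as they were.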
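
To make the box arithmetic concrete, here is a minimal sketch that repeats the crop box calculation with plain floats, without iText. The names LetterBox and CenterOn are hypothetical helpers introduced only for this illustration, and the 7"×9" page used below is an assumed example, not taken from the question.

```csharp
using System;

// Hypothetical helper, not part of iText: same arithmetic as the answer, plain floats only.
public static class LetterBox
{
    public const float ResultWidth = 8.5f * 72;   // 612 units
    public const float ResultHeight = 11f * 72;   // 792 units

    // Returns the lower-left corner of an 8.5" x 11" box
    // that shares its center with the given crop box.
    public static (float Left, float Bottom) CenterOn(
        float left, float bottom, float width, float height)
    {
        float newLeft = left - (ResultWidth - width) / 2;
        float newBottom = bottom - (ResultHeight - height) / 2;
        return (newLeft, newBottom);
    }
}
```

For an assumed 7"×9" page whose crop box is (0, 0, 504, 648), LetterBox.CenterOn(0, 0, 504, 648) yields a lower-left corner of (-54, -72), so the new crop box is (-54, -72, 612, 792). The negative origin is expected: the content stays at its original coordinates and the page boundary grows around it by 0.75" on the left and right and 1" on the top and bottom.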
