How to split one large PDF file into several smaller files no larger than 10Mb

Time:09-24

Suppose I have one large PDF file, 36Mb in size. I'd like to split it into several smaller files, each no larger than 10Mb.

So far I have written code (using the DevExpress PDF API) that checks a file's size and, if it is larger than 10Mb, splits it into two files: the first half of the original's pages goes into the first file and the second half into the second.

I'd like it to then recursively check whether each newly created file still exceeds the 10Mb limit, and split those files further until they are within the limit.

However, given that splitting the file by page count does not necessarily halve the file size, my issue is maintaining the order of the original document.

For example, splitting ABC.pdf (36Mb) into two files could produce:

ABC_1.pdf - 8Mb
ABC_2.pdf - 28Mb

In which case ABC_2.pdf would need to be split further while ABC_1.pdf would not.

Is it possible to keep splitting a file of arbitrary size until it meets the size requirements and maintain the original document order with this in mind?
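
For illustration, the recursion described above keeps the original order as long as each oversized range is split into a left half and a right half, and the left half is always fully processed before the right. The following is a minimal sketch of that idea working on page ranges only, not a DevExpress implementation: `OrderedSplitter` and the `measure` callback are hypothetical names, and `measure` is assumed to return the size in bytes of a PDF built from the given pages (for example by saving those pages to a temporary file and reading its length).

```csharp
using System;
using System.Collections.Generic;

public static class OrderedSplitter
{
    // Returns (First, Count) page ranges in original document order.
    // measure(first, count) is a hypothetical callback: it is assumed to return
    // the size in bytes of a PDF containing pages first .. first + count - 1.
    public static List<(int First, int Count)> Split(
        int first, int count, long maxBytes, Func<int, int, long> measure)
    {
        var result = new List<(int First, int Count)>();
        if (count <= 0)
        {
            return result;
        }

        // A single page cannot be split further, so it is accepted even if oversized.
        if (count == 1 || measure(first, count) <= maxBytes)
        {
            result.Add((first, count));
            return result;
        }

        int half = count / 2;
        // Left half first, then right half: the output stays in page order
        // no matter how unevenly the bytes are distributed between the halves.
        result.AddRange(Split(first, half, maxBytes, measure));
        result.AddRange(Split(first + half, count - half, maxBytes, measure));
        return result;
    }
}
```

Numbering the output files by their position in the returned list (ABC_1.pdf, ABC_2.pdf, and so on) then matches the order of the original document, even when only one of the two halves needs further splitting.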

CodePudding user response:

I do not have access to DevExpress, but I have a code sample using PDFSharp.

I hope it helps:

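using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

public void Split(string filePath, int maxSizeInBytes)
{
    // Import mode is required to copy pages into another document.
    var sourceFile = PdfReader.Open(filePath, PdfDocumentOpenMode.Import);
    var targetFilesCount = 1;
    var targetFileName = Path.GetFileNameWithoutExtension(filePath);
    var targetFile = new PdfDocument();

    // Name of the part currently being written, e.g. ABC_1.pdf.
    string targetName() => $"{targetFileName}_{targetFilesCount}.pdf";

    for (int i = 0; i < sourceFile.Pages.Count; i++)
    {
        // Append the next page and save, so the size on disk can be measured.
        targetFile.Pages.Add(sourceFile.Pages[i]);
        targetFile.Save(targetName());

        var targetFileSize = new FileInfo(targetName()).Length;

        // If the part is now too big, move its last page into a new part.
        // A part holding a single oversized page is kept as-is: it cannot be
        // split further, and saving a document with no pages would fail.
        if (targetFileSize > maxSizeInBytes && targetFile.Pages.Count > 1)
        {
            targetFile.Pages.Remove(targetFile.Pages[targetFile.Pages.Count - 1]);
            targetFile.Save(targetName());

            targetFilesCount++;

            targetFile = new PdfDocument();
            targetFile.Pages.Add(sourceFile.Pages[i]);
            targetFile.Save(targetName());
        }
    }
}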

Call it like this:

Split("ABC.pdf", 1000000 * 10);
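
Note that 1000000 * 10 is 10,000,000 bytes, i.e. decimal megabytes. The question does not say which unit the 10Mb limit uses, so as a small sketch under that assumption, both variants can be spelled out as named constants (`SizeLimits` and its members are hypothetical names, not part of PDFSharp):

```csharp
public static class SizeLimits
{
    // 10 MB in decimal units, the value used in the call above.
    public const int TenMegabytes = 10 * 1000 * 1000;   // 10,000,000 bytes

    // 10 MiB in binary units, which is what many file managers display as "10 MB".
    public const int TenMebibytes = 10 * 1024 * 1024;   // 10,485,760 bytes
}
```

Passing the smaller decimal value is the safer choice when it is unclear which unit the limit is enforced in, since a file that fits in 10,000,000 bytes also fits in 10,485,760.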