End of file exception - Merge pdf using pdfbox

Time:07-29

I have PDF files on S3 and would like to merge them into one PDF.

I read each file from S3 into an array of InputStreams and merge them with PDFBox's PDFMergerUtility. The code below works when I add all the input streams first and call mergeDocuments() once, after the for loop, but it fails when I call mergeDocuments() for each input stream individually. I suspect I am missing some I/O handling on the input streams, such as closing them.

    for (PS3ObjectStream pFileS3Obj : PS3ObjectStream) {
        try {
            pdfMerger.addSource(pFileS3Obj.getS3ObjectInputStream());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    // Merge once, after every source has been added.
    pdfMerger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());

Not working:

    for (PS3ObjectStream pFileS3Obj : PS3ObjectStream) {
        try {
            pdfMerger.addSource(pFileS3Obj.getS3ObjectInputStream());
            pdfMerger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

I am getting the error "End-of-File, expected line".

Any pointers on how I should look into this issue?

java.io.IOException: Error: End-of-File, expected line
at org.apache.pdfbox.pdfparser.BaseParser.readLine(BaseParser.java:1107)
at org.apache.pdfbox.pdfparser.COSParser.parseHeader(COSParser.java:2650)
at org.apache.pdfbox.pdfparser.COSParser.parsePDFHeader(COSParser.java:2633)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:219)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1230)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1148)
at org.apache.pdfbox.multipdf.PDFMergerUtility.legacyMergeDocuments(PDFMergerUtility.java:455)
at org.apache.pdfbox.multipdf.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:346)
at com.djcs.pslintegration.pdfutility.mergeService.MergePdfFileService.mergePDfFiles(MergePdfFileService.java:92)
at com.djcs.pslintegration.pdfutility.mergeService.MergePdfFileService.combinePdfFiles(MergePdfFileService.java:47)
at com.djcs.pslintegration.pdfutility.controller.PDFUtilityController.combinePdfFiles(PDFUtilityController.java:98)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:117)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1067)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:963)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:909)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:681)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:764)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:197)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:360)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:399)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:890)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1743)

CodePudding user response:

This is because PDFMergerUtility keeps a list of sources and does not reset it after mergeDocuments() is done. In the second code segment you therefore re-run the merge on the same, growing list of input streams on every iteration, even though the first input stream has already been consumed.
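
The effect can be reproduced without PDFBox or S3. The sketch below is only an illustration: SourceListMerger is a hypothetical stand-in, not PDFBox code, that keeps its sources between calls in the way described above and reports how many bytes each source still yields.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ConsumedStreamDemo {

    /** Hypothetical stand-in for a merger that never clears its source list. */
    static class SourceListMerger {
        private final List<InputStream> sources = new ArrayList<>();

        void addSource(InputStream in) {
            sources.add(in);
        }

        /** Reads every registered source to the end and returns the byte count of each. */
        List<Integer> mergeDocuments() {
            List<Integer> sizes = new ArrayList<>();
            for (InputStream in : sources) {
                try {
                    sizes.add(in.readAllBytes().length);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            return sizes;
        }
    }

    /** Wraps a string in an in-memory stream; stands in for an S3 object stream. */
    static InputStream stream(String content) {
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        SourceListMerger merger = new SourceListMerger();

        merger.addSource(stream("%PDF-first"));
        System.out.println(merger.mergeDocuments()); // prints [10]

        // The first stream is still in the list, but it is already at end-of-file.
        merger.addSource(stream("%PDF-second"));
        System.out.println(merger.mergeDocuments()); // prints [0, 11]
    }
}
```

On the second call the first source yields zero bytes. A PDF parser given such a stream finds no header line, which is consistent with the "End-of-File, expected line" error raised from parseHeader in the stack trace.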

My requirement is to merge one file at a time and capture any failed file in the response object; with the second option I will not be able to capture failed files.

So if you want to merge one file at a time, it would be better to use the appendDocument() method instead of mergeDocuments(): start with an empty PDDocument as the destination, and load the current PDDocument from S3 before each call.

Here's a code attempt (I can't test it because I don't use S3; there is also a risk that source is closed too early if any of its resources are used in the destination).

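// Needs: java.io.IOException, java.io.InputStream,
// org.apache.pdfbox.pdmodel.PDDocument, org.apache.pdfbox.multipdf.PDFMergerUtility
PDFMergerUtility pdfMerger = new PDFMergerUtility();
PDDocument destination = new PDDocument();
for (PS3ObjectStream pFileS3Obj : PS3ObjectStream)
{
    PDDocument source;
    try (InputStream is = pFileS3Obj.getS3ObjectInputStream())
    {
        source = PDDocument.load(is);
    }
    catch (IOException e)
    {
        // do something, e.g. record this file as failed
        continue; // nothing was loaded, so skip the append
    }
    try
    {
        pdfMerger.appendDocument(destination, source);
    }
    catch (Exception e)
    {
        // do something, e.g. record this file as failed
    }
    finally
    {
        // close() can throw IOException, so the enclosing method must declare or handle it
        source.close();
    }
}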
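
The "do something" branches are where failed files can be recorded for the response. The following is a minimal sketch of that bookkeeping only, under stated assumptions: FileStep, MergeReport and processAll are hypothetical names, and the lambda in main stands in for the load-and-append step, so no PDFBox or S3 calls are shown.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeReportDemo {

    /** Hypothetical per-file step; in the real service this would load the PDF and append it. */
    interface FileStep {
        void apply(String key) throws Exception;
    }

    /** Hypothetical response object: which files were merged and which failed, with the reason. */
    static class MergeReport {
        final List<String> merged = new ArrayList<>();
        final Map<String, String> failed = new LinkedHashMap<>();
    }

    static MergeReport processAll(List<String> keys, FileStep step) {
        MergeReport report = new MergeReport();
        for (String key : keys) {
            try {
                step.apply(key);
                report.merged.add(key);
            } catch (Exception e) {
                // One bad file does not abort the rest; remember it and continue.
                report.failed.put(key, String.valueOf(e.getMessage()));
            }
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a.pdf", "broken.pdf", "c.pdf");
        MergeReport report = processAll(keys, key -> {
            // Simulated failure for one file.
            if (key.startsWith("broken")) {
                throw new IOException("Error: End-of-File, expected line");
            }
        });
        System.out.println("merged: " + report.merged); // merged: [a.pdf, c.pdf]
        System.out.println("failed: " + report.failed); // failed: {broken.pdf=Error: End-of-File, expected line}
    }
}
```

Because each file is handled in its own try block, a failure is tied to a specific file name, which is what merging everything in a single mergeDocuments() call cannot provide.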