Broken PDF when forcedownloaded via AJAX

Time:12-18

I'm using an API call (cURL, server-side) to receive an XML file. The XML file has a base64 encoded PDF embedded within, and I'm extracting this file from the XML, and then saving it on the server for later downloading.

I'm doing it like this:

<?php

$pdfContents = $content["DocumentHeader"]["DocumentPdf"]; // base64 string PDF
$endDocument = base64_decode($pdfContents);
$realName = 'some_random_string.pdf';
$filename = 'adjusted_name.pdf';
file_put_contents($realName,$endDocument);

header('Content-Description: File Transfer');
header('Content-type: application/octet-stream');
header('Content-Disposition: attachment; filename='.$filename);
header('Connection: Keep-Alive');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header('Content-Transfer-Encoding: binary');
readfile($realName);

unlink($realName);

die();
?>

The front-end is receiving the file like so:

$.ajax({
    url: 'get-file.php',
    type: 'GET',
    data: {},
    success:function(data,status,xhr) {
        var blob = new Blob([data], { type:'application/pdf' });
        var downloadUrl = URL.createObjectURL(blob);

        window.location.href = downloadUrl;
    }
});

While this does download a PDF file, that file is broken. By broken, I mean that the contents are not entirely valid.

See the following image. On the right is the broken file, and on the left is what the file should display, when open in, say, Sublime Text 3 (or any other text editor).

[Image: PDF contents comparison]

I'm guessing I'm doing something wrong with the way I'm creating the Blob, or (not) processing the data, but I'm unsure what.

Here are some of the things I've tried:

  1. Changing the header('Content-type') to application/pdf
  2. Removing / commenting out the header('Content-Transfer-Encoding')
  3. Using the TextEncoder and TextDecoder
  4. Changing the front-end JS handling of data to:
var dataArray = new Uint8Array(data.length);
for (var i = 0; i < data.length; i++) {
    dataArray[i] = data.charCodeAt(i);
}
var blob = new Blob([dataArray], { type:'application/pdf' });

You might ask why I'm not simply passing a link to the PDF and simulating a click on a hidden <a href>, why I'm not setting the responseType to blob, or why I'm using AJAX to begin with.

The answer to the first question is twofold - there are multiple simultaneous users (in the hundreds), and I'd like to avoid them knowing the real filename, and also to avoid keeping that file on the server for an indefinite period of time.

The second issue (responseType) - my response is not always a PDF file. It might be a JSON, containing a message for the user - the XML was not downloaded, there was an error while retrieving the parameters needed for the XML API call, etc, etc.

The third issue - I want this to work without the need for target blanks and wondering whether the user's browser will block the opening of a new tab, and I also want to keep the user on the page from which they initiated the download etc.

Finally, my question.

What do I need to change in the Blob creating part of my code, in order to get a functional PDF? I'm guessing it's an encoding issue, but I'm unsure how to go about fixing it.


EDIT

I've experimented with the conversion to blob, and after changing it to this:

    var byteString = data;
    var arrayBuffer = new ArrayBuffer(byteString.length);
    var intArray = new Uint8Array(arrayBuffer);
    for (var i = 0; i < byteString.length; i++) {
        intArray[i] = byteString.charCodeAt(i);
    }

    var blob = new Blob([intArray], { type:'application/pdf' });

I still get a broken PDF, but with a slightly different encoding issue, like in the following image.

[Image: ArrayBuffer conversion]

CodePudding user response:

Managed to resolve the issue.

Despite my initial doubts, the problem was indeed on the server. Specifically, this part:

    header('Content-Transfer-Encoding: binary');
    readfile($realName);

I was sending the raw binary stream, which made it impossible for the front-end logic to convert it correctly to what I needed: the response is read as text, so bytes that do not form valid text are replaced before my code ever sees them. The thing that gave it away (apart from the helpful comments) was logging the contents of xhr.responseText to the console - instead of the expected format of the PDF contents, I noticed that there were diamonds / improperly converted characters.
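
The diamonds can be reproduced in isolation. The sketch below assumes the response body was decoded as UTF-8 text, which is the default when no binary responseType is set. The sample bytes are illustrative, not taken from the actual file. It also suggests why the charCodeAt loop from the question could not repair the data: the original byte values are already gone by the time the string exists.

```javascript
// Illustrative bytes: "%PDF" followed by four non-ASCII bytes, similar to the
// binary marker near the top of many PDF files (assumed sample, not real data).
var original = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0xE2, 0xE3, 0xCF, 0xD3]);

// Decode the bytes as UTF-8 text, as happens to a response read as a string.
var text = new TextDecoder('utf-8').decode(original);

// Try to rebuild the bytes from the string with charCodeAt.
var rebuilt = new Uint8Array(text.length);
for (var i = 0; i < text.length; i++) {
    rebuilt[i] = text.charCodeAt(i); // values above 255 are truncated to 8 bits
}

// Collect the positions where the rebuilt bytes differ from the original.
var damaged = [];
for (var j = 0; j < original.length; j++) {
    if (original[j] !== rebuilt[j]) {
        damaged.push(j);
    }
}

// The string contains U+FFFD, the replacement character shown as a diamond.
console.log(text.indexOf('\uFFFD') !== -1); // true
console.log(damaged); // positions of the bytes that were lost
```

The ASCII part of the header survives, which is why the broken file still looks roughly like a PDF in a text editor while the binary sections are unusable.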

With that in mind, I changed my code like this:

<?php

$pdfContents = $content["DocumentHeader"]["DocumentPdf"]; // base64 string PDF
//$endDocument = base64_decode($pdfContents);  <== not needed in this case
//$realName = 'some_random_string.pdf';  <== not needed in this case
$filename = 'adjusted_name.pdf';
//file_put_contents($realName,$endDocument); <== not needed in this case

header('Content-Description: File Transfer');
header('Content-type: application/octet-stream');
header('Content-Disposition: attachment; filename='.$filename);
header('Connection: Keep-Alive');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
//header('Content-Transfer-Encoding: binary');   <== not good in this case
//readfile($realName);   <== not good in this case
echo $pdfContents; // <== base64 representation of the raw PDF contents
//unlink($realName); <== not needed in this case, no file is written

die();
?>

Then, in the front-end code, I made additional changes:

$.ajax({
    url: 'get-file.php',
    type: 'GET',
    data: {},
    success:function(data,status,xhr) {
        //var blob = new Blob([data], { type:'application/pdf' });

        var byteString = atob(data);
        var intArray = new Uint8Array(byteString.length);
        for (var i = 0; i < byteString.length; i++) {
            intArray[i] = byteString.charCodeAt(i);
        }

        var blob = new Blob([intArray], { type:'application/pdf' });
        var downloadUrl = URL.createObjectURL(blob);    
        window.location.href = downloadUrl;
    }
});

This forces the download of the received PDF, which looks as expected upon opening.
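
Since the response is not always a PDF, the decoding step can be pulled into a small helper that first checks what came back. This is only a sketch: the names base64ToBytes and parseDownloadResponse are hypothetical, and it assumes the server sends its error messages with a Content-Type of application/json, which the code above does not show.

```javascript
// Convert a base64 string into the raw bytes it encodes.
function base64ToBytes(base64) {
    var byteString = atob(base64.trim()); // each char code is one byte (0-255)
    var bytes = new Uint8Array(byteString.length);
    for (var i = 0; i < byteString.length; i++) {
        bytes[i] = byteString.charCodeAt(i);
    }
    return bytes;
}

// Decide how to treat a response body based on its Content-Type header.
// Assumption: error messages arrive as application/json, anything else is
// treated as the base64-encoded PDF.
function parseDownloadResponse(contentType, body) {
    if (/^application\/json\b/i.test(contentType || '')) {
        return { kind: 'message', message: JSON.parse(body) };
    }
    return { kind: 'pdf', bytes: base64ToBytes(body) };
}
```

Inside the success callback this could be called as parseDownloadResponse(xhr.getResponseHeader('Content-Type'), xhr.responseText). Passing xhr.responseText rather than data matters here, because jQuery may already have parsed a JSON response into an object. When kind is 'pdf', the bytes go into new Blob([result.bytes], { type:'application/pdf' }) as before.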
