How to save PDF from HTML with the same rendering in app script

Time:09-24

I need to save an HTML web page as a PDF in my Google Drive. I am using Apps Script.
However, the PDF is rendered differently from the HTML page. I need the PDF to look the same as the HTML.

Here is a screenshot of the HTML page:

[image]

and here is my code:

function downloadFile() {
  var fileURL = "http.....";
  var fileName = "";
  var fileSize = "";

  // Fetch the page, authorizing the request with the script's OAuth token
  var response = UrlFetchApp.fetch(fileURL, {
    headers: { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() },
  });
  var htmlBody = response.getContentText();
  var rc = response.getResponseCode();

  if (rc == 200) {
    // Convert the fetched HTML to a PDF blob
    var blob = Utilities.newBlob(htmlBody, MimeType.HTML)
      .getAs('application/pdf')
      .setName('Nota.pdf');
    var folder = DriveApp.getFolderById("1gBA8YCs3PH7v7CNl3nlsjNqYzhOxhjYa");
    if (folder != null) {
      // Save the PDF to the Drive folder and record its name and size
      var file = folder.createFile(blob);
      fileName = file.getName();
      fileSize = file.getSize();
    }
  }
  var fileInfo = { 'rc': rc, "filename": fileName, "fileSize": fileSize };

  Logger.log(fileInfo);
}

UPDATE

Here is the HTML for the web page:

[image]

I know it is possible to save this PDF correctly, because if I use a Chrome extension called HTML TO PDF, it converts correctly, as shown in the following figure:

[image]

Answer:

The best way to do this is via Puppeteer. Puppeteer is a Node.js library that can launch a headless web browser.

const puppeteer = require('puppeteer');

// await is only valid inside an async function, so wrap the steps in one
(async () => {
  // I found these sizes work best when outputting to A4 format
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--window-size=794,1123'],
    defaultViewport: {
      width: 980,
      height: 1386
    }
  });
  const page = await browser.newPage();
  // Paste the HTML inside the template literal; String.raw keeps backslashes as plain text
  const doc = String.raw`YOUR STRING`;
  // I found that it works better if you first navigate to a page and then set the content.
  await page.goto('https://google.com');
  // set the html content of the page
  await page.setContent(doc);
  const file = 'File.pdf';
  const path = 'output path';
  // saves a pdf to the provided path (joining folder and file name is an assumption)
  const pdf = await page.pdf({
    path: `${path}/${file}`,
    scale: 0.8,
    displayHeaderFooter: true,
    printBackground: true,
    margin: {
      top: 80,
      bottom: 80,
      left: 30,
      right: 30
    }
  });
  await browser.close();
})();

Pro: Gives a really clean output with selectable text.

Con: You have to run this code on a Node.js server for the best result.
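Since the question starts from a URL rather than an HTML string, a variant that navigates straight to the page may be closer to what is needed. The following is only a minimal sketch, assuming Puppeteer is installed and the page is reachable without extra authentication. The helper names buildPdfOptions and savePageAsPdf, and the example URL in the usage comment, are hypothetical and not part of the original answer.

```javascript
// Sketch: render a live URL (not an HTML string) to PDF with Puppeteer.
// buildPdfOptions and savePageAsPdf are hypothetical helper names.

// Collects the options passed to page.pdf() in one place.
function buildPdfOptions(outputPath, landscape = false) {
  return {
    path: outputPath,       // where the PDF file is written
    format: 'A4',
    landscape: landscape,
    printBackground: true,  // keep background colours and images
  };
}

async function savePageAsPdf(url, outputPath, landscape = false) {
  // Required lazily so this file still loads where Puppeteer is not installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so stylesheets and images have loaded.
    await page.goto(url, { waitUntil: 'networkidle0' });
    await page.pdf(buildPdfOptions(outputPath, landscape));
  } finally {
    // Always close the browser, even if navigation or printing fails.
    await browser.close();
  }
}

// Usage (assumed URL and file name):
// savePageAsPdf('https://example.com/page', 'Nota.pdf').catch(console.error);
```

Letting the browser load the page itself means external stylesheets, fonts, and images are fetched before printing, which is usually what makes the PDF match the on-screen rendering.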

For more details, see the Puppeteer documentation.

--headless --disable-gpu --enable-logging --print-to-pdf="%UserProfile%\Documents\Demofile.pdf" --no-margins --disable-extensions --print-to-pdf-no-header --disable-popup-blocking  --run-all-compositor-stages-before-draw --disable-checker-imaging  https://bling.com.br/doc.view.php?id=2565744b051b9f1d3177c30ccf1e65b2

PROs: The output is exactly as the HTML is rendered in a browser.

CONs: The output is exactly as an HTML page would be saved as a portrait PDF from a browser (the command-line default is landscape=false), so to change the layout you need another method.
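The command-line switches above can also be driven from a script. The following is a rough sketch in Node.js, assuming a Chrome or Chromium binary is available. The helper names buildChromeArgs and printToPdf and the chromePath parameter are hypothetical. Only a subset of the switches from the command above is used, and whether a given Chrome version accepts each of them has not been verified here.

```javascript
const { execFile } = require('child_process');

// Builds the argument list from switches that appear in the command above.
// buildChromeArgs is a hypothetical helper name.
function buildChromeArgs(url, outputPath) {
  return [
    '--headless',
    '--disable-gpu',
    `--print-to-pdf=${outputPath}`,
    '--print-to-pdf-no-header',
    url, // the page to print goes last
  ];
}

// chromePath is an assumption: pass the location of the Chrome or Chromium binary.
// callback receives (error, stdout, stderr) when the process exits.
function printToPdf(chromePath, url, outputPath, callback) {
  execFile(chromePath, buildChromeArgs(url, outputPath), callback);
}

// Usage (assumed binary path, URL, and output file):
// printToPdf('/usr/bin/chromium', 'https://example.com/page', 'Demofile.pdf',
//   (err) => { if (err) console.error(err); });
```

Passing the switches as an array to execFile avoids shell quoting problems with paths that contain spaces.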
