Converting PDF to PNG with Python (without pdf2image)

Time:10-21

I want to convert a single-page PDF into a PNG file. I installed pdf2image and get an error saying that poppler is not installed (I am on Windows).

According to this question: Poppler in path for pdf2image

Poppler has to be installed and added to PATH.

I cannot do either of those (I don't have permissions on the system I am working with).

I had a look at OpenCV and PIL, and neither seems to offer a way to do this conversion.

PIL (see https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html?highlight=pdf) cannot read PDFs; it can only save images as PDFs. The same goes for OpenCV.

Any suggestions on how to do the PDF-to-PNG conversion? I can install any Python library, but I cannot touch the Windows installation.

Thanks

CodePudding user response:

PyMuPDF supports pdf to image rasterization without requiring any external dependencies.

Sample code to do a basic pdf to png transformation:

import fitz  # PyMuPDF, imported as fitz for backward compatibility reasons
file_path = "my_file.pdf"
doc = fitz.open(file_path)  # open document
for page in doc:
    pix = page.get_pixmap()  # render page to an image (72 dpi by default)
    pix.save(f"page_{page.number}.png")  # page.number is the 0-based page index

CodePudding user response:

Here is a snippet that generates PNG images of arbitrary resolution (dpi):

import fitz
file_path = "my_file.pdf"
dpi = 300  # choose desired dpi here
zoom = dpi / 72  # zoom factor, standard: 72 dpi
magnify = fitz.Matrix(zoom, zoom)  # magnifies in x, resp. y direction
doc = fitz.open(file_path)  # open document
for page in doc:
    pix = page.get_pixmap(matrix=magnify)  # render page to an image
    pix.save(f"page-{page.number}.png")

This generates PNG files named page-0.png, page-1.png, ... Choosing dpi < 72 produces thumbnail-sized page images.
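
To get a feel for what a given dpi means in pixels, here is a small sketch of the arithmetic behind the zoom factor. It does not use PyMuPDF at all. The helper name pixel_size and the US Letter page size (612 x 792 points) are just assumptions for illustration, and the renderer may round slightly differently, so treat the numbers as estimates.

```python
def pixel_size(width_pt, height_pt, dpi):
    """Estimate the rendered image size in pixels for a page given in points.

    PDF page sizes are measured in points (72 per inch), so the zoom
    factor is dpi / 72, the same value used for the matrix in the snippet above.
    """
    zoom = dpi / 72
    return round(width_pt * zoom), round(height_pt * zoom)


# Hypothetical example page: US Letter, 612 x 792 points.
for dpi in (36, 72, 300):
    print(dpi, pixel_size(612, 792, dpi))
```

At 72 dpi the image has the same dimensions in pixels as the page has in points; anything below that shrinks it (thumbnails) and anything above enlarges it.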
