Not able to find number of pages of PDF using Python 3.X: DependencyError: PyCryptodome is required

Time:08-30

I am performing data validation on files that I download from a URL. One of the validation checks is the number of pages in a PDF. Using the PyPDF2 package and its PdfFileReader class, this worked until I encountered a PDF with 256-bit AES encryption that has a permissions password but no document open password. I have no access to any passwords, since these files come from manufacturer websites, so I decided that for now I would just check whether the PDF is encrypted and skip it if so. However, whether I try to retrieve the page count or check if the PDF is encrypted, I get this error:

DependencyError: PyCryptodome is required for AES algorithm

This error occurs at line 6, the if statement.

This is despite having pycryptodome installed and the AES module imported. Also, I am using Jupyter Notebook. Here is my code:

! pip install PyPDF2
! pip install pycryptodome
from PyPDF2 import PdfFileReader
from Crypto.Cipher import AES

if PdfFileReader('Media Downloaded Files/spk-10-3144 bro.pdf').isEncrypted:
    print('This file is encrypted.')
else:
    print(PdfFileReader('Media Downloaded Files/spk-10-3144-bro.pdf').numPages)

Solution:

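! pip install pikepdf
from pikepdf import Pdf

# Open the PDF; pikepdf needs no password when only a permissions password is set
pdf = Pdf.open('Media Downloaded Files/spk-10-3144-bro.pdf')
# Number of pages in the document
len(pdf.pages)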

CodePudding user response:

I had a problem with PyPDF3 (a fork of PyPDF2) involving encryption. I solved it by switching to pikepdf, which implements more encryption algorithms. Try it out!
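
For anyone who would rather stay with PyPDF2, one thing that may be worth ruling out is that the package was installed into a different environment than the one the notebook kernel runs in. The check below is a minimal sketch using only the standard library. The helper name is_importable is made up, and whether this is the cause of the error in the question is an assumption, not something the question confirms.

```python
import importlib.util
import sys


def is_importable(top_level_name):
    """Return True if the running interpreter can locate the top-level module."""
    return importlib.util.find_spec(top_level_name) is not None


# The interpreter that the notebook kernel is actually running.
print("Kernel interpreter:", sys.executable)

# PyCryptodome installs under the top-level package name "Crypto".
for name in ("Crypto", "PyPDF2"):
    print(name, "importable:", is_importable(name))
```

If Crypto is reported as not importable, installing with %pip install pycryptodome (which targets the kernel's own environment) and then restarting the kernel is one thing to try.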
