PDFminer3k: an error occurred when converting PDF to TXT: pdfminer.pdfparser.PDFSyntaxError: Invalid ob

Time:12-28

An error occurred when converting a PDF to TXT with PDFminer3k. Hoping an expert can suggest a solution!

"C:\Program Files\Python37\python.exe" D:/python/PythonWS/0702/0702.py
WARNING:root:Wrong type: <PDFStream(3): raw=278, {'Type': /Metadata, 'Subtype': /XML, 'Length': 278, 'Filter': /FlateDecode}> required: <class 'dict'>
WARNING:root:always locate objid=221
Mark
Traceback (most recent call last):
  File "C:\Program Files\Python37\lib\site-packages\pdfminer\pdfparser.py", line 377, in _getobj
    obj = objs[i]
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/PYTHON/PythonWS/0702/0702.py", line 51, in <module>
    readPDF(path, toPath)
  File "D:/PYTHON/PythonWS/0702/0702.py", line 39, in readPDF
    for page in pdfFile.get_pages():
  File "C:\Program Files\Python37\lib\site-packages\pdfminer\pdfparser.py", line 568, in get_pages
    for (pageid, tree) in search(self.catalog['Pages'], self.catalog):
  File "C:\Program Files\Python37\lib\site-packages\pdfminer\pdfparser.py", line 552, in search
    tree = dict_value(obj, strict=True).copy()
  File "C:\Program Files\Python37\lib\site-packages\pdfminer\pdftypes.py", line 92, in typecheck_value
    x = resolve1(x)
  File "C:\Program Files\Python37\lib\site-packages\pdfminer\pdftypes.py", line 58, in resolve1
    x = x.resolve()
  File "C:\Program Files\Python37\lib\site-packages\pdfminer\pdftypes.py", line 47, in resolve
    return self.doc.getobj(self.objid)
  File "C:\Program Files\Python37\lib\site-packages\pdfminer\pdfparser.py", line 532, in getobj
    result = self._getobj(objid)
  File "C:\Program Files\Python37\lib\site-packages\pdfminer\pdfparser.py", line 379, in _getobj
    raise PDFSyntaxError('Invalid object number: objid=%r' % (objid))
pdfminer.pdfparser.PDFSyntaxError: Invalid object number: objid=2

Process finished with exit code 1
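
The two tracebacks are linked: the IndexError at line 377 of pdfparser.py happens first, and the PDFSyntaxError at line 379 is raised while that error is being handled. That suggests the lookup ran past the end of a list of parsed objects. Below is a minimal sketch of that pattern only. The PDFSyntaxError class here is a stand-in and get_object / describe_failure are hypothetical helpers, not pdfminer's actual code:

```python
class PDFSyntaxError(Exception):
    """Stand-in for pdfminer's exception so the sketch runs without pdfminer."""


def get_object(objs, index, objid):
    # Hypothetical helper, NOT pdfminer's real _getobj.
    try:
        return objs[index]
    except IndexError:
        # Raising inside an except block makes Python chain the two errors,
        # which prints "During handling of the above exception, ...".
        raise PDFSyntaxError('Invalid object number: objid=%r' % (objid))


def describe_failure(objs, index, objid):
    """Return (error message, type name of the underlying error), or (None, None)."""
    try:
        get_object(objs, index, objid)
    except PDFSyntaxError as exc:
        # __context__ holds the exception that was being handled.
        return str(exc), type(exc.__context__).__name__
    return None, None


if __name__ == "__main__":
    print(describe_failure([], 0, 2))
```

Catching the outer exception and inspecting its __context__ like this is one way to log the underlying cause for a problem file instead of letting the whole script exit.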

CodePudding user response:

For the program code, see: https://blog.csdn.net/Jfirm7/article/details/79941233

CodePudding user response:

Has this problem been solved? I have run into the same issue.