Looping through folder to read pdf files

Time:03-04

I have this code, which worked fine for an individual PDF file but stopped working once I added a loop. I want to loop through multiple PDF files laid out as folder --> subfolders --> PDF files. The path I give is the top-level folder only, not the subfolders.

import fitz
import os
path = "/users/folder"

for i in os.listdir(path):
    if i.endswith(".pdf"):
       with fitz.open(path) as doc:
          text = ""
          for page in doc:
              text += page.getText().strip()
          return text

CodePudding user response:

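You are passing the "path" variable, which is the folder, to fitz.open() as if it were a file. Open each file by joining the folder and the file name instead. Try this: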

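import fitz
import os
path = "/users/folder"

for i in os.listdir(path):
    if i.endswith(".pdf"):
        # Open the file itself, not the folder that contains it
        with fitz.open(path + "/" + i) as doc:
            text = ""
            for page in doc:
                # Append each page's text (newer PyMuPDF releases name this page.get_text())
                text += page.getText().strip()
        print(text)  # "return" is only valid inside a function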
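
For background, os.listdir() returns bare file names rather than full paths, and it does not descend into subfolders, which is why each name has to be joined to the folder before opening. A minimal sketch of that behaviour follows; the helper name list_pdf_paths and the file names are made up for illustration, and the demo uses a throwaway temporary directory rather than a real PDF folder.

```python
import os
import tempfile

def list_pdf_paths(folder):
    """Return full paths of the .pdf files directly inside folder (no recursion)."""
    return sorted(
        os.path.join(folder, name)       # turn the bare name into a usable path
        for name in os.listdir(folder)
        if name.lower().endswith(".pdf")  # assumption: match .PDF as well as .pdf
    )

# Hypothetical layout: one PDF at the top level, one inside a subfolder.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.pdf"), "w").close()
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "b.pdf"), "w").close()

    print(os.listdir(tmp))       # bare names only; order is not guaranteed
    print(list_pdf_paths(tmp))   # only a.pdf appears; sub/b.pdf is not reached
```

Because the listing stops at the top level, this approach alone will not find PDFs stored in subfolders.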

CodePudding user response:

import fitz
import os

path = "/users/folder"
for root, dirs, files in os.walk(path):
    # On the first iteration root equals path
    # On each later iteration root is the next subdirectory that was found
    for i in files:
        # Every file in the current root is checked
        if i.endswith(".pdf"):
            # Join root (not path) with the file name so files in subfolders open too
            with fitz.open(os.path.join(root, i)) as doc:
                text = ""
                for page in doc:
                    # Newer PyMuPDF releases name this page.get_text()
                    text += page.getText().strip()
            print(text)  # no "return" here: there is no function to return from
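
If the goal is to get the text back with "return", the loop can be wrapped in functions. The following is only a sketch under some assumptions: the function names find_pdfs, read_pdf_text and read_all_pdfs are invented here, it uses page.get_text() (the name in newer PyMuPDF releases; older ones call it getText()), and it imports fitz inside the reading function so that the folder walk can be tried without PyMuPDF installed.

```python
import os

def find_pdfs(top):
    """Yield the full path of every .pdf file under top, including subfolders."""
    for root, _dirs, files in os.walk(top):
        for name in sorted(files):
            if name.lower().endswith(".pdf"):  # assumption: also accept .PDF
                yield os.path.join(root, name)

def read_pdf_text(pdf_path):
    """Return the text of all pages of one PDF, joined with newlines."""
    import fitz  # PyMuPDF; imported lazily so find_pdfs works without it
    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text().strip() for page in doc)

def read_all_pdfs(top):
    """Map each PDF path under top to its extracted text."""
    return {pdf_path: read_pdf_text(pdf_path) for pdf_path in find_pdfs(top)}
```

Calling read_all_pdfs("/users/folder") would then return a dictionary keyed by file path, so the text of each PDF stays separate instead of being overwritten on every iteration.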
