I have the following folder structure:
root
│   file001.docx
│   file002.docx
│
└───folder1
    │   file003.docx
    │   file004.docx
    │
    └───subfolder1
        │   file005.docx
        │   file006.docx
        │
        └───subfolder2
                file007.docx
I want to write a program where the user types in a root directory and a keyword, and every file containing that keyword is shown. For example, if I input "hello there!", file007.docx will show up (assume the text "hello there!" is contained in file007.docx), letting the user know that the typed words are in that Word document.
To approach this, I first build a list of all the Word documents inside the folders and subfolders using this code:
def find_doc():
    variable = input('What is your directory?')  # asking for root directory
    os.chdir(variable)
    files = []
    for dirpath, dirnames, filenames in os.walk(variable):
        for filename in [f for f in filenames if f.endswith(".docx")]:
            files.append(filename)
    return files
This is the second piece of code, which searches the contents of each Word document:
all_files = find_doc()  # just calling the first function I just made
while True:
    keyword = input('Input your word or type in Terminate to exit: ')
    for i in range(len(all_files)):
        text = docx2txt.process(all_files[i])
        if keyword.lower() in text.lower():  # to make it case insensitive
            print(all_files[i])
    if keyword == 'Terminate' or keyword == 'terminate':
        break
In theory, if I enter the word "hello" at the prompt input('Input your word or type in Terminate to exit: '), I should get back file007.docx, because all_files = find_doc() returns
['file001.docx',
'file002.docx',
'file003.docx',
'file004.docx',
'file005.docx',
'file006.docx',
'file007.docx']
thanks to the recursive nature of os.walk().
However, it raised an error: FileNotFoundError: [Errno 2] No such file or directory:
Where did I go wrong? Thanks!
CodePudding user response:
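os.walk() only gives you bare filenames in filenames, so docx2txt.process() looks for each one relative to the current working directory and cannot find the files that live in subfolders. I think you want to modify your function into something like this, so that it stores each filename together with its path: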
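import os

def find_doc():
    variable = input('What is your directory?')  # asking for root directory
    # os.chdir() is not needed here; with a relative path it would make
    # os.walk() look for the directory inside itself.
    files = []
    for dirpath, dirnames, filenames in os.walk(variable):
        for filename in [f for f in filenames if f.endswith(".docx")]:
            # join the folder being walked with the bare filename
            files.append(os.path.join(dirpath, filename))
    return files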
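To see why the bare filenames fail, here is a minimal sketch that builds a small throwaway tree in a temporary directory and compares the bare names with the joined paths. The helper name collect_docx and the two-file layout are made up for illustration, and the files are plain placeholders rather than real Word documents.

```python
import os
import tempfile


def collect_docx(root):
    """Return the full path of every .docx file under root (hypothetical helper)."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".docx"):
                # dirpath is the folder currently being walked, so joining it
                # with the bare filename gives a path that can be opened.
                paths.append(os.path.join(dirpath, filename))
    return sorted(paths)


# Build a throwaway tree: root/file001.docx and root/folder1/file003.docx
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "folder1"))
    for rel in ("file001.docx", os.path.join("folder1", "file003.docx")):
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("placeholder")  # stand-ins, not real Word files

    found = collect_docx(root)
    bare_names = [os.path.basename(p) for p in found]
    # A bare name only resolves if the file sits directly in the root folder.
    nested_exists_as_bare = os.path.exists(os.path.join(root, "file003.docx"))
    all_joined_exist = all(os.path.exists(p) for p in found)
```

In this sketch the nested file cannot be found under its bare name from the root folder, while every joined path exists, which matches the FileNotFoundError in the question.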
You should also change your while loop so that the exit check runs before the for loop. Otherwise, typing "terminate" is first searched for as a keyword in every document before the loop exits.
while True:
    keyword = input('Input your word or type in Terminate to exit: ')
    if keyword.lower() == 'terminate':
        break
    else:
        for path in all_files:
            text = docx2txt.process(path)
            if keyword.lower() in text.lower():  # to make it case insensitive
                print(path)
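If it helps, the search step could also be wrapped in a small function. This is only a sketch: search_files, read_docx and the read_text parameter are hypothetical names, and the default reader assumes the docx2txt package from the question is installed.

```python
def read_docx(path):
    """Default reader; assumes the third-party docx2txt package is installed."""
    import docx2txt  # imported lazily so search_files works with other readers
    return docx2txt.process(path)


def search_files(paths, keyword, read_text=read_docx):
    """Return the paths whose text contains keyword, ignoring case."""
    needle = keyword.lower()
    matches = []
    for path in paths:
        try:
            text = read_text(path)
        except FileNotFoundError:
            # Skip paths that no longer exist instead of crashing the search.
            continue
        if needle in text.lower():
            matches.append(path)
    return matches
```

Inside the while loop, printing each entry of search_files(all_files, keyword) would then replace the inner for loop. Passing a different read_text makes it possible to try the matching logic without any real Word documents.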