Introduction:
I'm currently building a keyword detection program. It is given a number of '.txt' files and loops through them, searching each one for any keyword from a list and reporting which files contained a keyword. The keywords are stored in a list in a separate Python file, which is imported into the main program file.
Goal:
The goal is to print which keyword from the list was found when a text file is parsed. For example, if "Hello" is in the keyword list and appears in a text file, I want the output to be "Hello, found in example_text01.txt". At the moment the program only reports whether a keyword was found or not. Ideally the process should look like the example below.
Example Wordlist:
word_list = ["Demo", "Text", "Hello", "Example"]
Example Text:
Hello how are you?
Desired Outcome:
"Hello, found in example_text01.txt"
What I Have Tried:
- Tried using the 'in' keyword.
It ran without errors, but it skipped any text file containing a keyword and did not process it.
- Made the keyword file plain text and used readline() to parse the text.
Received the following error: AttributeError: 'list' object has no attribute 'readlines'
- Returned the keyword class when writing the result document.
Just returned <class 'ast.keyword'>
Code:
The following is the code I am currently using.
from os import listdir
from os.path import join

keywords = ['Hello', 'Example', 'Keywords']
# Create and open result.txt, where the results of the keyword scan are stored
with open("/PATH/TO/result.txt", "w") as f:
    # Loop over the folder the .txt files are stored in
    for filename in listdir("/PATH/TO/txt"):
        # Open each text file as it is processed by the loop
        with open(join("/PATH/TO/txt", filename)) as currentFile:
            text = currentFile.read()
        if any(keyword in text for keyword in keywords):
            f.write('Keyword found in ' + filename[:-4] + '\n')
        else:
            f.write('No keyword in ' + filename[:-4] + '\n')
Currently, for each text file, the program writes a line to 'result.txt' saying whether or not any keyword from the list was found. In addition, I would like to include which keyword was found. Any help would be greatly appreciated, thanks!
CodePudding user response:
Just change:
if any(keyword in text for keyword in keywords):
    f.write('Keyword found in ' + filename[:-4] + '\n')
else:
    f.write('No keyword in ' + filename[:-4] + '\n')
to:
keywordsFound = [k for k in keywords if k in text]  # get all found keywords
if keywordsFound:  # if any keywords were found
    for k in keywordsFound:  # for each found keyword
        f.write(f'{k}, found in {filename[:-4]}\n')  # say it was found
else:
    f.write(f'No keyword in {filename[:-4]}\n')  # if none were found, say so
This collects every keyword that occurs in the current text file, then writes one line per keyword to the result file.
If you want only the first keyword that is found you can use:
keywordsFound = [k for k in keywords if k in text]  # get all found keywords
if keywordsFound:  # if any keywords were found
    k = keywordsFound[0]  # take only the first keyword
    f.write(f'{k}, found in {filename[:-4]}\n')  # say it was found
else:
    f.write(f'No keyword in {filename[:-4]}\n')  # if none were found, say so
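One thing to keep in mind: k in text is a case-sensitive substring test, so 'Text' would also match inside 'Textbook', and 'hello' would not match 'Hello'. If whole-word, case-insensitive matching is closer to what you want, a minimal sketch using the re module could look like the following. The helper name find_whole_words is only illustrative and is not part of the original code.

```python
import re

def find_whole_words(text, keywords):
    """Return the keywords that occur in text as whole words, ignoring case."""
    found = []
    for k in keywords:
        # \b marks a word boundary; re.escape guards against regex special characters
        if re.search(r'\b' + re.escape(k) + r'\b', text, flags=re.IGNORECASE):
            found.append(k)
    return found

word_list = ["Demo", "Text", "Hello", "Example"]
print(find_whole_words("Hello how are you?", word_list))   # prints ['Hello']
print(find_whole_words("A textbook example.", word_list))  # prints ['Example']
```

The returned list can be used in place of keywordsFound above. Note that \b assumes each keyword starts and ends with a letter, digit, or underscore.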
CodePudding user response:
Why not just modify the bottom part:
Instead of
if any(keyword in text for keyword in keywords):
    f.write('Keyword found in ' + filename[:-4] + '\n')
else:
    f.write('No keyword in ' + filename[:-4] + '\n')
...
for k in keywords:
    f.write((f'Keyword "{k}" found in ' if k in text else 'No keyword in ') + filename[:-4] + '\n')
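Note that the loop above writes one line per keyword for every file, including a 'No keyword in' line for each keyword that is absent. If a single line per file is preferable, the matches can be joined into one string. This is a rough sketch rather than a drop-in replacement: the function name report_line is made up for illustration, and it keeps the full file name (including '.txt') to match the desired output in the question.

```python
def report_line(filename, text, keywords):
    """Build one result line for a file, naming every keyword found in it."""
    found = [k for k in keywords if k in text]
    if found:
        return f"{', '.join(found)}, found in {filename}\n"
    return f"No keyword in {filename}\n"

keywords = ["Demo", "Text", "Hello", "Example"]
print(report_line("example_text01.txt", "Hello how are you?", keywords), end="")
# prints: Hello, found in example_text01.txt
```

Inside the file loop this would be used as f.write(report_line(filename, text, keywords)).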