I have a list of words which I want to find in a text file. Right now I'm using any to check each line of the file. It returns True or False correctly, so it's working fine.
My question is whether it's possible to see which word it found. I can see by reading my code and the text which word was found, but I would like to use that word in the code if possible.
An example of what I mean is below. This code prints True or False depending on whether any of the words are in the line.
list_of_words = ['apple', 'banana', 'lemon']
with open(file, 'r') as f:
    lines = f.readlines()
    for line in lines:
        x = any(word in line for word in list_of_words)
        print(x)
CodePudding user response:
You can use next instead, with a default in case no element is found.
x = next((word for word in list_of_words if word in line), None)
if x is not None:
    ...
If None can be an element in the list, you may use some dedicated sentinel object instead, e.g.
not_in = object()
x = next((word for word in list_of_words if word in line), not_in)
if x is not not_in:
    ...
Or explicitly catch the StopIteration error:
try:
    x = next(word for word in list_of_words if word in line)
    ...
except StopIteration:
    pass
Note that all of these approaches only give you the first such element, and then stop checking the rest (like any does); if instead you are interested in all such elements, you should use a list comprehension as in the other answer.
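To make the first-match behaviour concrete, here is a minimal sketch that wraps the next call in a helper. The name first_word_in and the sample lines are made up for illustration. Note that "first" means first in list_of_words, not first in the line.

```python
list_of_words = ["apple", "banana", "lemon"]


def first_word_in(line, words):
    """Return the first entry of words that occurs in line, or None."""
    # The generator is lazy, so checking stops at the first match.
    return next((word for word in words if word in line), None)


# Hypothetical sample data standing in for the lines of a file.
sample_lines = [
    "this is line 1",
    "a lemon and a banana",
]

results = [first_word_in(line, list_of_words) for line in sample_lines]
print(results)  # [None, 'banana']
```

In the second line "lemon" appears earlier in the text, but "banana" is returned because it comes earlier in the list.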
CodePudding user response:
Instead of using any, use a simple list comprehension as follows:
x = [word for word in list_of_words if word in line]
This gives you a list of all the words from list_of_words that occur in that line; the list is empty if none of them do.
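As a rough sketch of how this could be used line by line, the snippet below collects the matches together with the line number. The helper name matches_per_line and the sample lines are assumptions for illustration only. Because word in line is a substring test, "pineapple" also counts as a hit for "apple".

```python
list_of_words = ["apple", "banana", "lemon"]


def matches_per_line(lines, words):
    """Return (line_number, matching_words) pairs for lines with a match."""
    report = []
    for number, line in enumerate(lines, start=1):
        found = [word for word in words if word in line]
        if found:  # an empty list is falsy, so this replaces any()
            report.append((number, found))
    return report


# Hypothetical sample data standing in for the lines of a file.
sample_lines = ["lemon and apple pie", "nothing here", "pineapple juice"]

for number, found in matches_per_line(sample_lines, list_of_words):
    print(number, found)
# 1 ['apple', 'lemon']
# 3 ['apple']
```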
CodePudding user response:
Ideally, you want to use a set instead of a list to hold the important words.
If you just want to know which words were found and don't care how many times they are found, then this will work:
words = set(['apple', 'banana', 'lemon'])
with open(file, 'r') as f:
    lines = f.readlines()
    for line in lines:
        words_in_line = set(line.split())  # or however else you want to put them in a set.
        for word in words_in_line.intersection(words):
            print(word)
Borrowing a bit from another comment, an even better way to solve it is this:
words = set(['apple', 'banana', 'lemon'])
with open(file, 'r') as f:
    lines = f.readlines()
    for line in lines:
        words_in_line = (word for word in line.split() if word in words)
        for word in words_in_line:
            print(word)
This works because a generator expression is itself an iterable: if no word in the line matches, the inner for loop simply does nothing, and otherwise it runs once per match. Unlike the set intersection, it also keeps repeated words and yields them in the order they appear in the line.
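One thing to be aware of: splitting the line matches whole tokens, while word in line matches substrings, so the two approaches can give different results. The sketch below compares them on a made-up line; stripping punctuation and lowercasing is just one possible way to normalise tokens, not something the answers above prescribe.

```python
import string

words = {"apple", "banana", "lemon"}
line = "Pineapple, banana. lemon"  # hypothetical sample line

# Substring test: "apple" is found inside "Pineapple".
substring_hits = sorted(word for word in words if word in line)

# Whole-token test: "banana." (with the dot) is not equal to "banana".
token_hits = sorted(words.intersection(line.split()))

# Whole-token test after stripping punctuation and lowercasing each token.
cleaned = {token.strip(string.punctuation).lower() for token in line.split()}
cleaned_hits = sorted(words.intersection(cleaned))

print(substring_hits)  # ['apple', 'banana', 'lemon']
print(token_hits)      # ['lemon']
print(cleaned_hits)    # ['banana', 'lemon']
```

Which behaviour is right depends on whether "pineapple" should count as a hit for "apple".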
CodePudding user response:
You can use an assignment expression, the := operator (Python 3.8 or newer):
list_of_words = ["apple", "banana", "lemon"]
sample_lines = [
    "this is line 1",
    "this line 2 has banana in it.",
    "this is line 3",
]
for line in sample_lines:
    if any((found_word := word) in line for word in list_of_words):
        print(found_word)
Prints:
banana
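A caveat with this approach, shown as a small sketch: the name is rebound for every word that gets tested, so it only holds a real match when any returned True. The helper first_match_walrus and the variable last_tested are made-up names for illustration.

```python
list_of_words = ["apple", "banana", "lemon"]


def first_match_walrus(line, words):
    """Return the first word found in line, or None, using := inside any()."""
    if any((found := word) in line for word in words):
        return found  # any() stopped at the first hit, so found is that word
    return None


print(first_match_walrus("this line 2 has banana in it.", list_of_words))  # banana
print(first_match_walrus("this is line 3", list_of_words))  # None

# When nothing matches, the walrus target still ends up holding the last
# word that was tested, which is not a match.
matched = any((last_tested := word) in "no fruit here" for word in list_of_words)
print(matched, last_tested)  # False lemon
```

So only read the variable inside the if branch, or fall back to None as the helper does.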