I want to write a function that takes a list of strings and a given word, and returns a tuple containing the line number (starting from 1) of the string with the most mentions of the word and the number of mentions in that string. If several strings share the same maximum number of mentions, the first of them is returned. The match is not case-sensitive.
For example, consider the list:
Tomatoes = ['tonight tomatoes grow towards the torchlit tower',
            'the birds fly to the sky',
            'to take the fish to the sea and to tell the tale',
            'to fly to the skies and to taste the clouds']
Note that lines 3 and 4 have the most mentions of the word 'to'. When we pass Tomatoes to the function with the search word 'to', the call should look like this:
most_word_mentions(Tomatoes, 'to')
It should return the number of the third line and the number of mentions of 'to' as a tuple, which should look like (3, 3). Although line 3 has the same number of mentions as line 4, it is returned because it occurs first in the list.
I have created a function that partially achieves what I want; however, it fails under specific conditions.
def most_word_mentions(message, word):
    wordcount = []
    for i in range(len(message)):
        message[i] = message[i].lower()  # word is not case sensitive
        wordcount.append(message[i].count(word))
    return (wordcount.index(max(wordcount)) + 1), max(wordcount)
If we call most_word_mentions(Tomatoes, 'to'), the function fails to return the correct line and number of mentions. Instead, it returns (1, 6). This is because line 1, although it does not contain the standalone word 'to', contains many other words with 'to' in them. I would like to write a function that accounts for this issue and that can be applied to similar scenarios. Could this be done with only for loops and if statements, without list comprehensions or imports?
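To make the problem concrete, here is a small check of what str.count reports for each line of the example. It is only an illustrative sketch; the name substring_counts is made up for this check and is not part of the function.

```python
# Illustrative check only: shows that str.count matches substrings, not whole words.
Tomatoes = ['tonight tomatoes grow towards the torchlit tower',
            'the birds fly to the sky',
            'to take the fish to the sea and to tell the tale',
            'to fly to the skies and to taste the clouds']

substring_counts = []  # hypothetical name, used only for this check
for line in Tomatoes:
    # Every occurrence of the characters 'to' is counted,
    # including those inside longer words such as 'tomatoes' or 'tower'.
    substring_counts.append(line.lower().count('to'))

print(substring_counts)  # [6, 1, 3, 3]
```

Line 1 scores 6 even though the word 'to' never appears in it on its own, which is why the function picks it.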
CodePudding user response:
This solution uses just for loops and if statements, as you required. It compares whole words rather than substrings, so 'tomatoes' is not counted as a mention of 'to'.
def most_word_mentions(list_of_strings, word):
    word = word.lower()
    highest_count = 0
    earliest_index = 0
    for i in range(len(list_of_strings)):
        curr_count = 0
        # Compare whole words, not substrings.
        for w in list_of_strings[i].lower().split():
            if w == word:
                curr_count += 1
        # Strict '>' keeps the earliest line when counts tie.
        if curr_count > highest_count:
            earliest_index = i
            highest_count = curr_count
    return (earliest_index + 1, highest_count)
print(most_word_mentions(Tomatoes, 'to'))
Output:
(3, 3)
CodePudding user response:
Here is one I started on, which follows a slightly different idea:
def mostCommon(word, sentences):
    word = word.lower()  # the match is not case-sensitive
    sentencecount = {}  # keep track of sentences and occurrences in each sentence
    for item in sentences:  # iterate through sentences
        lowcasesentence = item.lower().split()  # make the sentence lowercase and split it into a list of its words
        sentencecount[item] = lowcasesentence.count(word)  # list.count counts whole-word matches; store the result in sentencecount
    return sentencecount  # return sentences as a dictionary with count as their value
Instead of iterating through each letter combination, you should split the sentence into words and count those. Note that my function returns all the sentences with their counts, not just the one with the highest count.
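If you want just the winning sentence from that dictionary, one way is a second loop over it. The sketch below is only one possible approach: pick_first_max is a hypothetical helper name, mostCommon is repeated in condensed form so the example runs on its own, and it relies on dictionaries keeping insertion order (Python 3.7+).

```python
def mostCommon(word, sentences):
    # Condensed version of the function above: sentence -> whole-word count.
    word = word.lower()
    sentencecount = {}
    for item in sentences:
        sentencecount[item] = item.lower().split().count(word)
    return sentencecount

def pick_first_max(sentencecount):
    # Hypothetical helper: return (sentence, count) for the first sentence
    # with the highest count, or (None, 0) if the word never appears.
    best_sentence = None
    best_count = 0
    for sentence in sentencecount:  # iterates in insertion order
        # Strict '>' means a later sentence with an equal count does not replace the first.
        if sentencecount[sentence] > best_count:
            best_sentence = sentence
            best_count = sentencecount[sentence]
    return (best_sentence, best_count)

Tomatoes = ['tonight tomatoes grow towards the torchlit tower',
            'the birds fly to the sky',
            'to take the fish to the sea and to tell the tale',
            'to fly to the skies and to taste the clouds']

result = pick_first_max(mostCommon('to', Tomatoes))
print(result)  # ('to take the fish to the sea and to tell the tale', 3)
```

This returns the sentence itself rather than its line number. One caveat: identical sentences share a single dictionary key, so duplicates in the list collapse into one entry.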
CodePudding user response:
You may try this.
def most_word_mentions(message, word):
    word = word.lower()
    line = 0
    most = 0
    for i in range(len(message)):
        count = 0
        # Count whole-word matches in the current line.
        for w in message[i].lower().split():
            if word == w:
                count += 1
        # Strict '>' keeps the first line when counts tie.
        if count > most:
            line = i + 1
            most = count
    return (line, most)