file=open('ssss.txt','r')
dict1=dict()
for line in file:
    line = line.strip()
    line = line.lower()
    words = line.split(" ")
for word in words:
    if word in dict1:
        dict1[word]=dict1[word]+1
    else:
        dict1[word]=1
#print(sorted(dict1.items(), key=lambda kv:kv[1]))
print(dict1)
print("The top 10 words with maximum occurrence are:")
This is the txt file: [image: txt file]
Output: [image: output]
I want it to return a dictionary with all the words, not just those from the last line.
CodePudding user response:
for line in file:
    line = line.strip()
    line = line.lower()
    words = line.split(" ")
for word in words:
The second for will only run after the first one has finished. By then, words holds only what was on the last line. You probably want:
file=open('ssss.txt','r')
dict1=dict()
for line in file:
    line = line.strip()
    line = line.lower()
    words = line.split(" ")
    # nested: runs once for every line, not just after the last one
    for word in words:
        if word in dict1:
            dict1[word]=dict1[word]+1
        else:
            dict1[word]=1
print(dict1)
print("The top 10 words with maximum occurrence are:")
Notice that the second loop is now nested inside the first one, so the words are processed for every line.
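To see the difference without needing a file on disk, here is a small sketch that runs both loop structures over an in-memory list of lines. The sample lines and the function names count_last_line_only and count_all_lines are made up for illustration; they are not from the question.

```python
# Hypothetical sample data standing in for the lines of ssss.txt.
sample_lines = ["this is txt file\n", "txt file output\n", "output\n"]


def count_last_line_only(lines):
    """Original structure: the word loop runs once, after the line loop."""
    counts = {}
    words = []
    for line in lines:
        words = line.strip().lower().split(" ")  # overwritten on every line
    for word in words:  # only sees the words of the last line
        counts[word] = counts.get(word, 0) + 1
    return counts


def count_all_lines(lines):
    """Fixed structure: the word loop is nested inside the line loop."""
    counts = {}
    for line in lines:
        words = line.strip().lower().split(" ")
        for word in words:  # runs once per line
            counts[word] = counts.get(word, 0) + 1
    return counts


print(count_last_line_only(sample_lines))
print(count_all_lines(sample_lines))
```

With this sample, the first call only counts the single word on the last line, while the second counts the words from all three lines.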
CodePudding user response:
You are reading every line, but words = line.split(" ") overwrites words on each iteration, so only the words from the last line are kept. You need to accumulate the words from every line in a list.
from collections import Counter
with open('ssss.txt') as file:
    words = []
    for line in file:
        line = line.strip().lower()
        words.extend(line.split(" "))
    # words = [word for line in file for word in line.strip().lower().split(" ")]
d = Counter(words)
Or, as @Jeffrey shows, update the counter as you read each line. Adapted to use collections.Counter, it might look like this:
from collections import Counter
d = Counter()
with open('ssss.txt') as file:
    for line in file:
        line = line.strip().lower()
        d.update(line.split(" "))
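The question's last print mentions the top 10 words, but none of the snippets actually produce them. One possible way, assuming the counts live in a collections.Counter, is its most_common method. In the sketch below, the sample text held in an io.StringIO is a made-up stand-in for ssss.txt, and split() with no argument is swapped in for split(" ") so that double spaces and blank lines do not get counted as empty-string words.

```python
from collections import Counter
import io

# Hypothetical stand-in for open('ssss.txt'); swap in the real file as needed.
sample_file = io.StringIO("this is txt file\ntxt  file output\n\noutput\n")

d = Counter()
for line in sample_file:
    # split() with no argument splits on any run of whitespace and drops
    # empty strings, unlike split(" "), which yields '' for double spaces
    # or blank lines.
    d.update(line.lower().split())

print("The top 10 words with maximum occurrence are:")
for word, count in d.most_common(10):
    print(word, count)
```

most_common(10) returns at most ten (word, count) pairs sorted from the highest count down, so no manual sorting of the dictionary items is needed.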