Calling range of dictionary in python

Time:11-05

I have a dictionary, sorted in descending order, of the words in a .txt file and their frequencies, for example {'the':1682}. I need to write code so that only the 20 most frequent words are printed (since they are already ordered, that is just the first 20 items). I am aware that dictionaries are ordered by insertion; however, I am not sure how to leverage this to tell Python to print only the first 20. Here is the code I have:

def wordcount(book):
    single_list = []        
    unique = []    
    freq_dict = {}
 
    for word in wordlist:
        no_punc = word.strip(punctuation)
        lower_case = no_punc.lower()
        single_list.append(lower_case)
        unique = set(single_list)
    #num_unique = print(len(unique))
    for word in single_list:
        if word in freq_dict:
            freq_dict[word] += 1
        else:
            freq_dict[word] = 1 
    sorted_dict = dict(sorted(freq_dict.items(), key = lambda kv: kv[1], reverse = True))
    for w in sorted_dict:
        print(w, sorted_dict[w]) 
        
wordcount(book)

and the output is

the 1632
and 845
to 721
a 627
she 537
it 526
of 508
said 462
i 401
alice 386
in 367
you 362
was 357
that 276
as 262
her 248
at 210
on 193
with 180
all 180
had 178
but 166
for 153
so 150
be 146
very 144
not 144
what 136
this 134
little 128
they 127
he 120
out 113
is 102
down 101
one 101
up 98
his 96
about 94
if 94
then 90
no 87
know 86
like 85
were 85
them 84
would 83
went 83
herself 83
again 82
do 81
have 80
when 79
could 77
or 76
there 75
thought 74
off 73
time 68
me 68
queen 68

and so on for every word in the book (about 2800 words). So how do I get Python to print only the first 20?

CodePudding user response:

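Python has no built-in sorted dictionary type. Since Python 3.7, a regular dict is guaranteed to preserve insertion order, so a dict built from sorted items stays in that order; on older versions this should not be relied on.

To preserve order on those older versions, you need to use an OrderedDict.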

from collections import OrderedDict
newDict = OrderedDict()
for k, v in sorted(freq_dict.items(), key=lambda kv: kv[1], reverse=True):
   newDict[k] = v 

Then you can do something like:

for pos,(k,v) in enumerate(newDict.items()):
   if pos < 20:
       print(pos,k,v)
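
If the ordered mapping is only needed for printing, a possible shortcut is to slice the sorted list of pairs directly and skip the second dictionary. The following is a minimal sketch under that assumption; the helper name `top_n` and the sample counts are hypothetical and not part of the original code.

```python
def top_n(freq, n=20):
    """Return the n highest-count (word, count) pairs from a frequency dict.

    `top_n` is a hypothetical helper used only for illustration.
    """
    # sorted() returns a list, so ordinary slicing keeps just the first n pairs.
    return sorted(freq.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical sample data standing in for the real freq_dict.
freq_dict = {"and": 845, "the": 1632, "a": 627, "to": 721}

for pos, (word, count) in enumerate(top_n(freq_dict, 2), start=1):
    print(pos, word, count)
```

Asking for more items than the dictionary holds simply returns all of them, since slicing past the end of a list is allowed.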

CodePudding user response:

You would be better off using Counter from the collections module:

from collections import Counter 

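and then pass the words to it (the .split() call assumes wordlist is a single string; if it is already a list of words, pass it directly):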

Counter(wordlist.split())
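
To get only the 20 most frequent words out of a Counter, its most_common method takes the number of items to return. Here is a rough sketch of how the whole count might look; the helper name `top_words` and the sample sentence are assumptions for illustration, and dropping empty tokens is an extra step the original code does not perform.

```python
from collections import Counter
from string import punctuation


def top_words(words, n=20):
    """Return the n most common cleaned words as (word, count) pairs.

    `top_words` is a hypothetical helper; `words` is any iterable of strings.
    """
    # Same cleaning as in the question: strip punctuation, then lowercase.
    cleaned = (w.strip(punctuation).lower() for w in words)
    # Assumption: tokens that were pure punctuation (now empty) are skipped.
    counts = Counter(w for w in cleaned if w)
    # most_common(n) returns the n highest counts, already sorted descending.
    return counts.most_common(n)


# Hypothetical sample text standing in for the book.
sample = "The cat and the hat. The end!".split()
for word, count in top_words(sample, 3):
    print(word, count)
```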

CodePudding user response:

You can use itertools.islice(sorted_dict, 20) to get an iterator over the first 20 keys.

import itertools

def wordcount(book):
    single_list = []        
    unique = []    
    freq_dict = {}
 
    for word in wordlist:
        no_punc = word.strip(punctuation)
        lower_case = no_punc.lower()
        single_list.append(lower_case)
        unique = set(single_list)
    #num_unique = print(len(unique))
    for word in single_list:
        if word in freq_dict:
            freq_dict[word] += 1
        else:
            freq_dict[word] = 1 
    sorted_dict = dict(sorted(freq_dict.items(), key = lambda kv: kv[1], reverse = True))
    for w in itertools.islice(sorted_dict,20):
        print(w, sorted_dict[w])
        
wordcount(book)
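
The same idea can be applied to the key/value pairs rather than the keys, which avoids the second lookup into the dictionary. This is a small self-contained sketch; the helper name `first_n` and the sample counts are hypothetical.

```python
from itertools import islice


def first_n(d, n):
    """Return the first n (key, value) pairs of d in insertion order.

    `first_n` is a hypothetical helper used only for illustration.
    """
    # islice stops after n items, so the rest of the dict is never visited.
    return list(islice(d.items(), n))


# Hypothetical sample data standing in for the real freq_dict.
freq_dict = {"and": 845, "the": 1632, "a": 627, "to": 721}
sorted_dict = dict(sorted(freq_dict.items(), key=lambda kv: kv[1], reverse=True))

for word, count in first_n(sorted_dict, 2):
    print(word, count)
```

This relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onward.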