I've written a function that takes a phrase or sentence and returns a dictionary in which each key is a word and the associated value is that word's character count.
When I call the function with wordcount('hello my name is bobby')
it returns {'hello': {5}, 'my': {2}, 'name': {4}, 'is': {2}, 'bobby': {5}}
I'm not sure why the associated values have curly brackets.
I'd like it to return {'hello':5, 'my':2, 'name':4, 'is':2, 'bobby':5}. I've tried many things but can't get it to work.
def wordcount(text):
    from collections import defaultdict
    d = defaultdict(set)
    for word in text.split():
        d[word].add(len(word))
    return {a: b for a, b in d.items()}
CodePudding user response:
Each value in your dictionary is a set, because you are using defaultdict(set) to define it. Then you add the words' lengths to the corresponding sets. That's why you get the braces when you print them.
Try:
d = {}
...
d[word] = len(word)
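To see where the braces come from, here is a small sketch that builds the same dictionary twice: once with defaultdict(set), as in the question, and once with a plain dict. The function names lengths_as_sets and lengths_as_ints are made up for this illustration.

```python
from collections import defaultdict

def lengths_as_sets(text):
    # Missing keys get an empty set, and .add() puts the length inside it
    d = defaultdict(set)
    for word in text.split():
        d[word].add(len(word))
    return dict(d)

def lengths_as_ints(text):
    # Plain assignment stores the integer itself as the value
    d = {}
    for word in text.split():
        d[word] = len(word)
    return d

print(lengths_as_sets('hello my'))  # {'hello': {5}, 'my': {2}}
print(lengths_as_ints('hello my'))  # {'hello': 5, 'my': 2}
```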
CodePudding user response:
You are setting the default value of the defaultdict to "set" (https://docs.python.org/3/tutorial/datastructures.html#sets). That means: if there is no entry in the dict with that key, defaultdict adds a set. Afterwards, you add the length of the word to that set.
What you probably want does not require a default:
def wordcount(text):
    d = {}
    for word in text.split():
        d[word] = len(word)
    return d
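If you prefer something shorter, the same loop can probably be written as a dict comprehension. This is only an alternative sketch of the function above, not a different technique:

```python
def wordcount(text):
    # One key per word, with the word's length as the value
    return {word: len(word) for word in text.split()}

print(wordcount('hello my name is bobby'))
# {'hello': 5, 'my': 2, 'name': 4, 'is': 2, 'bobby': 5}
```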
CodePudding user response:
The values have curly brackets because they are sets, which are unordered collections of unique values.
Change these lines:
d = defaultdict(set)
for word in text.split():
    d[word].add(len(word))
To:
d = defaultdict()  # with no default factory this behaves like a plain dict, so d = {} works too
for word in text.split():
    d[word] = len(word)
CodePudding user response:
The reason is that defaultdict(set) declares in advance that any key not yet in the dictionary gets a set as its value. You then add the length of the word to that initially empty set, so every value ends up being a set. Try the code below:
def wordcount(text):
    from collections import defaultdict
    d = defaultdict(set)  # I think you should change this to d = {} or d = defaultdict(int), as you clearly want integer values only
    for word in text.split():
        d[word] = len(word)  # assigns an integer value to the key
    return {a: b for a, b in d.items()}
CodePudding user response:
Thanks for all the help, guys. If I'm trying to change the input to a text file with multiple lines, how would I go about doing that? Here's where I'm at so far, but it's not outputting correctly.
def wordcount(filename, delimiter=''):
    '''
    Python program to compute the character count of each word in a text file.
    Positional Parameters:
        filename (str) : Name of file
    Keyword Parameters:
        delimiter (str, default='') : Character that separates words
    Returns:
        (dict) : Dictionary with keys of the words and values of the character count
    '''
    d = {}
    with open(filename, 'r') as data:
        for line in data:
            words = line.split(delimiter)
            # assigns the character count to the value in the dictionary
            for word in words:
                d[word] = len(word)
    return {a: b for a, b in d.items()}
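One likely culprit is the default delimiter='': calling str.split('') raises "ValueError: empty separator". Also, when you split on an explicit delimiter, the newline at the end of each line stays attached to the last word. A minimal sketch of one possible fix, assuming you want whitespace-separated words by default, is to make the default None (so split() splits on any whitespace) and strip each line first:

```python
def wordcount(filename, delimiter=None):
    '''Return a dict mapping each word in the file to its character count.'''
    d = {}
    with open(filename, 'r') as data:
        for line in data:
            # strip() drops the trailing newline; split(None) splits on any whitespace
            for word in line.strip().split(delimiter):
                if word:  # skip empty strings produced by blank lines or repeated delimiters
                    d[word] = len(word)
    return d
```

You would call it as wordcount('words.txt'), or wordcount('words.csv', delimiter=',') for comma-separated words. Both file names are hypothetical.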