(Python) My task is to write a program that reads a line of text with input() and, for each word in the text, prints how many times that word has already occurred earlier in the text. The counts are kept in a dictionary. My code:
text = input()
words = {}
for word in text:
    if word not in words:
        words[word] = 0
        print(words[word])
    elif word in words:
        words[word] = words[word] + 1
        print(words[word])
An example input could be:
one two one two three two four three
The correct output should be:
0
0
1
1
0
2
0
1
My code, however, counts the occurrences of every character instead of every word, which makes the output far too long. How do I make it iterate over words rather than characters?
CodePudding user response:
That is because text is a string, and iterating over a string yields its characters one at a time. Use for word in text.split() instead. split() turns the string into a list, and by default it splits on whitespace, so here you get a list of words.
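Applied to the code in the question, the change could look like the sketch below. The function name counts_before and the use of dict.get are my own choices for illustration, not part of the original code, and the input is hard-coded instead of read with input() so the result is easy to check.

```python
def counts_before(text):
    """Return, for each word, how many times it appeared earlier in text."""
    seen = {}    # word -> number of occurrences so far
    result = []
    for word in text.split():  # split() with no arguments splits on whitespace
        previous = seen.get(word, 0)  # 0 if the word has not been seen yet
        result.append(previous)
        seen[word] = previous + 1
    return result


for count in counts_before("one two one two three two four three"):
    print(count)
```

For the example input this prints 0, 0, 1, 1, 0, 2, 0, 1 on separate lines, which matches the expected output in the question.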
CodePudding user response:
Given your example input, you would need to split text on whitespace in order to get words. In general, the problem of splitting arbitrary text into words/tokens is non-trivial; there are a lot of natural language processing libraries purpose-built for this.
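To illustrate why this is non-trivial, here is a small sketch comparing plain whitespace splitting with a simple regular expression. The sample sentence and the pattern are assumptions made up for this illustration. The pattern only handles lowercase ASCII letters and apostrophes, so it is nowhere near a real tokenizer.

```python
import re

sample = "One, two. One two!"

# Whitespace splitting keeps punctuation attached and is case-sensitive,
# so "One," and "One" would be counted as different words.
naive_tokens = sample.split()
print(naive_tokens)   # ['One,', 'two.', 'One', 'two!']

# A rough alternative: lowercase the text and keep runs of letters/apostrophes.
regex_tokens = re.findall(r"[a-z']+", sample.lower())
print(regex_tokens)   # ['one', 'two', 'one', 'two']
```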
Also, for counting things, the Counter class from the standard-library collections module is very useful.
from collections import Counter
text = input()
# Counter accepts any iterable, so the list of words can be passed directly
word_counts = Counter(text.split())
print(word_counts.most_common())
Output
[('two', 3), ('one', 2), ('three', 2), ('four', 1)]
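Note that most_common() gives the totals per word, whereas the question asks for the number of occurrences before each word. A Counter can also be used for that, since a missing key reads as 0. The following is only a sketch; the function name running_counts is hypothetical.

```python
from collections import Counter


def running_counts(text):
    """Return, for each word, how often it occurred earlier in text."""
    seen = Counter()   # missing words read as 0, so no membership test is needed
    result = []
    for word in text.split():
        result.append(seen[word])   # occurrences before this position
        seen[word] += 1
    return result


print(running_counts("one two one two three two four three"))
# [0, 0, 1, 1, 0, 2, 0, 1]
```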
CodePudding user response:
You are looking for the split method of the str type: https://docs.python.org/3/library/stdtypes.html?highlight=str%20split#str.split
Use it to create a list of words:
splitted_text = text.split()
The full example will look like:
text = 'this is an example and this is nice'
splitted_text = text.split()
words = {}
for word in splitted_text:
    if word not in words:
        words[word] = 0
    elif word in words:
        words[word] = words[word] + 1
print(words)
Which will output:
{'this': 1, 'is': 1, 'an': 0, 'example': 0, 'and': 0, 'nice': 0}