Background

I'm working on a HackerRank problem Word Order. The task is to

Read the following input from stdin
```
4
bcdef
abcdefg
bcde
bcdef
```
Produce the output that reflects:
- Number of unique words in first line
- Count of occurrences for each unique words Example:
```
3       # Number of unique words
2 1 1   # count of occurring words, 'bcdef' appears twice = 2
```

Problem

I've coded two solutions, the second one passes initial tests but fail due to exceeding time limit. First one would also work but I was unnecessarily sorting outputs (time limit issue would occur though).

Notes

In first solution I was unnecessarily sorting values, this is fixed in the second solution
I'm keen to be making better (proper) use of standard Python data structures, list/dictionary comprehension - I would be particularly keen to receive a solution that doesn't import any addittional modules, with exception of import os if needed.

Code

import os

def word_order(words):
    # Output no of distinct words
    distinct_words = set(words)
    n_distinct_words = len(distinct_words)
    print(str(n_distinct_words))
    
    # Count occurrences of each word
    occurrences = []
    
    for distinct_word in distinct_words:
        n_word_appearances = 0
        for word in words:
            if word == distinct_word:
                n_word_appearances  = 1
        occurrences.append(n_word_appearances)
    occurrences.sort(reverse=True)
    print(*occurrences, sep=' ')
    # for o in occurrences:
    #     print(o, end=' ')
    
def word_order_two(words):
    '''
    Run through all words and only count multiple occurrences, do the maths
    to calculate unique words, etc. Attempt to construct a dictionary to make
    the operation more memory efficient.
    '''
    # Construct a count of word occurrences
    dictionary_words = {word:words.count(word) for word in words}
    
    # Unique words are equivalent to dictionary keys
    unique_words = len(dictionary_words)
    
    # Obtain sorted dictionary values
    # sorted_values = sorted(dictionary_words.values(), reverse=True)
    result_values = " ".join(str(value) for value in dictionary_words.values())
    # Output results
    print(str(unique_words))
    print(result_values)
    return 0

if __name__ == '__main__':
    q = int(input().strip())

    inputs = []
    for q_itr in range(q):
        s = input()
        inputs.append(s)
        
    # word_order(words=inputs)
    word_order_two(words=inputs)

CodePudding user response：

Without any imports, you could count unique elements by

len(set(words))

and count their occurrences by

def counter(words):
    count = dict()
    for word in words:
        if word in count:
            count[word]  = 1
        else:
            count[word] = 1
    return count.values()

CodePudding user response：

Those nested loops are very bad performance wise (they make your algorithm quadratic) and quite unnecessary. You can get all counts in single iteration. You could use a plain dict or the dedicated collections.Counter:

from collections import Counter

def word_order(words):
    c = Counter(words)
    print(len(c))
    print(" ".join(str(v) for _, v in c.most_common()))

The "manual" implementation that shows the workings of the Counter and its methods:

def word_order(words):
    c = {}
    for word in words:
        c[word] = c.get(word, 0)   1
    print(len(c))
    print(" ".join(str(v) for v in sorted(c.values(), reverse=True)))
    # print(" ".join(map(str, sorted(c.values(), reverse=True))))

CodePudding user response：

You can use Counter then print output like below:

>>> from collections import Counter
>>> def counter_words(words):
...   cnt = Counter(words)
...   print(len(cnt))
...   print(*[str(v) for k,v in c.items()] , sep=' ')

>>> inputs = ['bcdef' , 'abcdefg' , 'bcde' , 'bcdef']
>>> counter_words(inputs)
3
2 1 1