How can I change this reducer code to find the longest words (and their length) rather than word frequencies?

This code finds the frequency of each word in a text file. I would like to know how to change it to find the longest words in the text file and print them out, e.g. "The longest word has 13 characters. The result includes: "

REDUCER CODE

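import sys

results = {}
# Each input line is expected to look like "word<TAB>frequency".
for line in sys.stdin:
    word, frequency = line.strip().split('\t', 1)
    # Accumulate the total count for each word.
    results[word] = results.get(word, 0) + int(frequency)

# Print the words in alphabetical order with their totals.
words = list(results.keys())
words.sort()
for word in words:
    print(word, results[word])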

MAPPER CODE

import sys

# Emit one tab-separated "word<TAB>1" pair per word, the format the reducer parses.
for line in sys.stdin:
    for word in line.strip().split():
        print(word, "1", sep="\t")

CodePudding user response：

To build on my suggestion (loop through words, keep longest in variable):

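import sys

longest = ""

for line in sys.stdin:  # or any other iterable of lines
    for word in line.lower().split():
        # split() already drops surrounding whitespace, so no strip() is needed.
        if len(word) > len(longest):
            longest = word

print("Longest word is:", longest, "with the length of:", len(longest))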

CodePudding user response：

If you don't want to keep all words then you could do something like this:

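import sys

longest = set()
max_length = 0
for line in sys.stdin:
    for word in line.strip().split():
        length = len(word)
        if length > max_length:
            # A strictly longer word replaces everything collected so far.
            max_length = length
            longest = {word}
        elif length == max_length:
            # A tie joins the current set of longest words.
            longest.add(word)

print(longest)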
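
To tie this back to the reducer in the question, here is a rough sketch of the same loop reading the mapper's "word<TAB>count" lines and printing the message the question asks for. The function name `longest_words` and the sample data are made up for illustration, and it assumes each line carries the word before the first tab.

```python
def longest_words(lines):
    """Return (max_length, set_of_words) from 'word<TAB>count' lines."""
    longest = set()
    max_length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Keep only the key; the count is not needed for this task.
        word = line.split("\t", 1)[0]
        length = len(word)
        if length > max_length:
            max_length = length
            longest = {word}
        elif length == max_length:
            longest.add(word)
    return max_length, longest


# Hypothetical sample input standing in for sys.stdin.
sample = ["cat\t1", "elephant\t1", "dog\t1", "aardvark\t1", "cat\t1"]
max_length, longest = longest_words(sample)
print(f"The longest word has {max_length} characters. "
      f"The result includes: {', '.join(sorted(longest))}")
# prints: The longest word has 8 characters. The result includes: aardvark, elephant
```

In an actual reducer script you would pass `sys.stdin` instead of `sample`.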

If you want to keep them, grouped by length, you could use a defaultdict:

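import sys
from collections import defaultdict

# Maps each word length to the set of words with that length.
words_length = defaultdict(set)
for line in sys.stdin:
    for word in line.strip().split():
        words_length[len(word)].add(word)

# max() over the dict yields the largest key, i.e. the longest length.
# Note: max() raises ValueError if the input was empty.
print(words_length[max(words_length)])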
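
One thing to watch with the grouped version is empty input, since `max()` on an empty dict raises `ValueError`. Below is a small sketch of one way around that, using the `default` argument of `max()`. The helper name `group_by_length` and the sample lines are only for illustration.

```python
from collections import defaultdict


def group_by_length(lines):
    """Map each word length to the set of words having that length."""
    words_length = defaultdict(set)
    for line in lines:
        for word in line.split():
            words_length[len(word)].add(word)
    return words_length


# Hypothetical sample lines standing in for sys.stdin.
groups = group_by_length(["the quick brown fox", "jumps over the lazy dog"])

# default=0 avoids a ValueError when no words were read at all.
longest_length = max(groups, default=0)
# .get() avoids creating a new empty entry in the defaultdict.
print(longest_length, sorted(groups.get(longest_length, set())))
# prints: 5 ['brown', 'jumps', 'quick']
```

Because all lengths are kept, you can also look up any other group afterwards, e.g. `groups[3]` for the three-letter words.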