Word frequencies

Time:07-28

I'm writing a Java program that calculates the frequencies of words in an input. The initial integer indicates how many words will follow. Here is my code so far:

import java.util.Scanner; 
public class LabProgram {
   public static void main(String[] args) {

      Scanner scnr = new Scanner(System.in);
      int numberWords = scnr.nextInt();
      String[] wordsList = new String[numberWords];
      int i;
      int j;
      int[] frequency = new int[numberWords];
      

      for (i = 0; i < numberWords; ++i) {
         wordsList[i] = scnr.next();
         frequency[i] = 0;
         for (j = 0; j < numberWords; ++j) {
            if (wordsList[i].equals(wordsList[j])) {
               frequency[i] = frequency[i] + 1;
            }
         }
      }
      for (i = 0; i < numberWords; ++i) {
         System.out.print(wordsList[i] + " - " + frequency[i]);
         System.out.print("\n");
      }
      
   }
}

When I input the following:

6 pickle test rick Pickle test pickle

This is the output:

pickle - 1
test - 1
rick - 1
Pickle - 1
test - 2
pickle - 2

However, this is the expected output:

pickle - 2
test - 2
rick - 1
Pickle - 1
test - 2
pickle - 2

It looks like it's picking up the correct frequencies for later occurrences, but not for the initial occurrences.

CodePudding user response:

For a scenario like this, you can either use a Map that keeps the frequency of each word, or do it with a stream and a grouping collector. You don't even need to know the number of words in advance, assuming you simply split the input on spaces.

With a stream it's basically a one-liner:

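// Requires java.util.Arrays, java.util.Map and java.util.stream.Collectors.
String input = "pickle test rick Pickle test pickle";
// With a stream: group equal words together and count the size of each group.
Map<String, Long> result = Arrays.stream(input.split(" ")).collect(Collectors.groupingBy(s -> s, Collectors.counting()));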

The map contains:

{Pickle=1, test=2, rick=1, pickle=2}

If you don't like streams, simply iterate over the words manually and increment the count stored for each word, using the word as the key in your map.
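A minimal sketch of that manual iteration, assuming the input arrives as a single whitespace-separated string. The class name ManualCount and the helper countWords are made up for illustration, and LinkedHashMap is chosen here only so the keys print in first-seen order; a plain HashMap would count the same way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ManualCount {
    // Hypothetical helper: counts how often each whitespace-separated word occurs.
    static Map<String, Integer> countWords(String input) {
        // LinkedHashMap keeps keys in the order they were first inserted.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : input.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty input string yields one empty token; skip it
            }
            Integer previous = counts.get(word);
            counts.put(word, previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {pickle=2, test=2, rick=1, Pickle=1}
        System.out.println(countWords("pickle test rick Pickle test pickle"));
    }
}
```

The comparison is case-sensitive, so "pickle" and "Pickle" are counted as different words, which matches the expected output in the question.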

CodePudding user response:

I would use a Map for this. A map can be used to store a value for a given key. You can use your words as keys and their counts as values. Using a Map makes your code easier to understand, shorter, and less prone to errors. For example:

import java.util.Scanner;
import java.util.Map;
import java.util.HashMap;

public class MyClass {
    public static void main(String args[]) {
        HashMap<String, Integer> frequencies = new HashMap<>();
        Scanner s = new Scanner(System.in);

        int wordCount = s.nextInt();
  
        for (int i = 0; i < wordCount; ++i) {
            String word = s.next();
            int count = frequencies.getOrDefault(word, 0);
            frequencies.put(word, count + 1);
        }
  
        for (Map.Entry<String, Integer> item : frequencies.entrySet()) {
            System.out.println(item.getKey() + ": " + item.getValue());
        }
    }
}
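Note that iterating over the map prints each distinct word once, in HashMap iteration order, whereas the expected output in the question has one line per input word, in input order. One possible way to get that format is to keep the words in a list and look each one up after counting. This is only a sketch: the class name WordFrequencies and the helper frequencyLines are hypothetical, and the sample input is hard-coded so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

class WordFrequencies {
    // Hypothetical helper: returns one "word - count" line per input word, in input order.
    static List<String> frequencyLines(List<String> words) {
        // First pass: count every word.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.put(word, counts.getOrDefault(word, 0) + 1);
        }
        // Second pass: emit the words in their original order with the total count.
        List<String> lines = new ArrayList<>();
        for (String word : words) {
            lines.add(word + " - " + counts.get(word));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample input from the question; use new Scanner(System.in) to read real input.
        Scanner scanner = new Scanner("6 pickle test rick Pickle test pickle");
        int wordCount = scanner.nextInt();
        List<String> words = new ArrayList<>();
        for (int i = 0; i < wordCount; ++i) {
            words.add(scanner.next());
        }
        frequencyLines(words).forEach(System.out::println);
    }
}
```

With the sample input this prints the six lines listed as the expected output in the question.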

CodePudding user response:

Building a HashMap of word frequencies is an efficient approach: it needs only one pass over the input, with expected constant time per update, instead of comparing every word with every other word as the nested loops do.

One way to do it is with the merge() method, which was added to the Map interface in Java 8. It expects three arguments: a key, a value to associate with that key if the key is not yet present in the map, and a function that is evaluated if the key already exists, to merge the previous value with the new one.

import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public static void main(String[] args) {
    Scanner scanner = new Scanner(System.in);
    int numberWords = scanner.nextInt();
    
    Map<String, Integer> frequencies = new HashMap<>();
    for (int i = 0; i < numberWords; i++) {
        // Insert 1 for a new word, otherwise add 1 to the existing count.
        frequencies.merge(scanner.next(), 1, Integer::sum);
    }
    
    frequencies.forEach((k, v) -> System.out.println(k + " -> " + v));
}

Output:

Pickle -> 1
test -> 2
rick -> 1
pickle -> 2

If you're comfortable with the Stream API, you can approach this problem using the built-in collector toMap() (note that the words can be read from the scanner directly inside the stream pipeline):

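import java.util.Map;
import java.util.Scanner;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public static void main(String[] args) {
    Scanner scanner = new Scanner(System.in);
    
    Map<String, Integer> frequencies = IntStream.range(0, scanner.nextInt())
        .mapToObj(i -> scanner.next())
        .collect(Collectors.toMap(
            Function.identity(), // the word itself is the key
            i -> 1,              // every occurrence starts with a count of 1
            Integer::sum         // counts for the same word are added together
        ));
    
    frequencies.forEach((k, v) -> System.out.println(k + " -> " + v));
}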

Output:

Pickle -> 1
test -> 2
rick -> 1
pickle -> 2

CodePudding user response:

The problem is in this condition:

if (wordsList[i].equals(wordsList[j]))

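wordsList[j] is null whenever j > i. This happens because the words are read in the outer for loop, so when the inner for loop runs, the words at indexes greater than i have not been read yet.

Because of that, the comparison for those indexes is made against null, and equals(null) returns false. Each word is therefore only counted against itself and the words read before it, which is why the first occurrences get counts that are too low.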
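One way to act on this diagnosis while keeping the array-based structure of the question is to read all the words first and only then run the nested counting loops. The following is a sketch under that assumption; the class name LabProgramFixed and the helper countFrequencies are hypothetical, and the sample input is hard-coded so the example runs on its own.

```java
import java.util.Scanner;

class LabProgramFixed {
    // Hypothetical helper: frequency[i] is how many times words[i] occurs in the whole array.
    static int[] countFrequencies(String[] words) {
        int[] frequency = new int[words.length];
        for (int i = 0; i < words.length; ++i) {
            for (int j = 0; j < words.length; ++j) {
                // Every element is already filled in, so no comparison is made against null.
                if (words[i].equals(words[j])) {
                    frequency[i] = frequency[i] + 1;
                }
            }
        }
        return frequency;
    }

    public static void main(String[] args) {
        // Sample input from the question; use new Scanner(System.in) to read real input.
        Scanner scnr = new Scanner("6 pickle test rick Pickle test pickle");
        int numberWords = scnr.nextInt();
        String[] wordsList = new String[numberWords];

        // First pass: read every word before any counting starts.
        for (int i = 0; i < numberWords; ++i) {
            wordsList[i] = scnr.next();
        }

        // Second pass: count, then print one line per input word.
        int[] frequency = countFrequencies(wordsList);
        for (int i = 0; i < numberWords; ++i) {
            System.out.println(wordsList[i] + " - " + frequency[i]);
        }
    }
}
```

This still compares every pair of words, so the work grows quadratically with the number of words; for large inputs a map-based count does the same job in a single pass.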
