I have just begun learning Streams, and my task is to count and sort all the words in an array of strings. I have already split my input into words, but I don't know how to add and update map entries using a stream.
Here is my parsing stream:
Stream<String> stringStream = lines.stream().flatMap(s -> Arrays.stream(s.split("[^a-zA-Z]+")));
String[] parsed = stringStream.toArray(String[]::new);
I have done this task without streams just like this:
Map<String, WordStatistics> wordToFrequencyMap = new HashMap<>();
for (String line : lines) {
    line = line.toLowerCase();
    // Split on runs of non-letter characters
    String[] mas = line.split("[^a-zA-Z]+");
    for (String word : mas) {
        if (word.length() > 3) {
            if (!wordToFrequencyMap.containsKey(word)) {
                wordToFrequencyMap.put(word, new WordStatistics(word, 1));
            } else {
                WordStatistics tmp = wordToFrequencyMap.get(word);
                tmp.setFreq(tmp.getFreq() + 1);
            }
        }
    }
}
WordStatistics class:
public class WordStatistics implements Comparable<WordStatistics> {
    private String word;
    private int freq;

    public WordStatistics(String word, int freq) {
        this.word = word;
        this.freq = freq;
    }

    public String getWord() {
        return word;
    }

    public int getFreq() {
        return freq;
    }

    public void setWord(String word) {
        this.word = word;
    }

    public void setFreq(int freq) {
        this.freq = freq;
    }

    @Override
    public int compareTo(WordStatistics o) {
        if (this.freq > o.freq)
            return 1;
        if (this.freq == o.freq) {
            return -this.word.compareTo(o.word);
        }
        return -1;
    }
}
CodePudding user response:
A simple approach is to collect with toMap() and a merge function:
Map<String, WordStatistics> wordToFrequencyMap = lines.stream()
    .map(s -> s.split("[^a-zA-Z]+"))
    .flatMap(Arrays::stream)
    .collect(Collectors.toMap(w -> w, w -> new WordStatistics(w, 1), (ws1, ws2) -> {
        ws1.setFreq(ws1.getFreq() + ws2.getFreq());
        return ws1;
    }));
To break down the toMap() arguments:
- w -> w just means use the stream element as the map key.
- The next argument produces a value for the key, which is initially a new instance of WordStatistics with a frequency of 1.
- Finally, we tell the collector how to merge values together when they belong to the same key. In our case, we sum the frequencies into one of the values (ws1) and return that as the merge result.
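To see the three toMap() arguments in action, here is a minimal self-contained sketch. It makes a few assumptions that are not in the answer above: it counts with plain Integer values instead of WordStatistics to keep the example short, the class name ToMapMergeDemo and the sample lines are made up, and it filters out the empty string that split() can produce when a line starts with a non-letter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapMergeDemo {
    // Counts words by collecting into a map and summing values on key collisions.
    static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
                .map(s -> s.split("[^a-zA-Z]+"))
                .flatMap(Arrays::stream)
                .filter(w -> !w.isEmpty())      // assumption: skip empty tokens from split()
                .collect(Collectors.toMap(
                        w -> w,                 // key: the word itself
                        w -> 1,                 // initial value for a new key
                        Integer::sum));         // merge function for repeated keys
    }

    public static void main(String[] args) {
        // Hypothetical sample input
        List<String> lines = List.of("alpha beta alpha", "beta alpha gamma");
        System.out.println(countWords(lines));
    }
}
```

The merge function is only called when a key is seen again, so a word that occurs once keeps its initial value of 1.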
CodePudding user response:
This should do pretty much the same as what you are doing in your loop right now.
// wordToFrequencyMap is the HashMap declared in your loop version
Pattern pattern = Pattern.compile("[^a-zA-Z]+");
lines.stream().flatMap(pattern::splitAsStream).filter(s -> s.length() > 3).forEach(s -> {
    WordStatistics tmp = wordToFrequencyMap.get(s);
    if (tmp == null) {
        wordToFrequencyMap.put(s, new WordStatistics(s, 1));
    } else {
        tmp.setFreq(tmp.getFreq() + 1);
    }
});
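Since the task also asks for sorting, a complete version of this approach might look like the sketch below. It is only an illustration under some assumptions: the nested WordStatistics is a trimmed-down stand-in for the class in the question (same ordering, public fields instead of getters and setters), the names ForEachCountDemo and countAndSort are made up, and the lowercasing step from the question's loop has been added.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class ForEachCountDemo {
    // Trimmed-down stand-in for the WordStatistics class from the question.
    public static final class WordStatistics implements Comparable<WordStatistics> {
        public final String word;
        public int freq;

        public WordStatistics(String word, int freq) {
            this.word = word;
            this.freq = freq;
        }

        @Override
        public int compareTo(WordStatistics o) {
            // Same ordering as in the question: ascending frequency,
            // ties broken by reverse alphabetical order.
            if (freq != o.freq) {
                return Integer.compare(freq, o.freq);
            }
            return o.word.compareTo(word);
        }
    }

    public static List<WordStatistics> countAndSort(List<String> lines) {
        Pattern pattern = Pattern.compile("[^a-zA-Z]+");
        Map<String, WordStatistics> wordToFrequencyMap = new HashMap<>();
        lines.stream()
                .map(String::toLowerCase)
                .flatMap(pattern::splitAsStream)
                .filter(s -> s.length() > 3)
                .forEach(s -> {
                    WordStatistics tmp = wordToFrequencyMap.get(s);
                    if (tmp == null) {
                        wordToFrequencyMap.put(s, new WordStatistics(s, 1));
                    } else {
                        tmp.freq++;
                    }
                });
        List<WordStatistics> sorted = new ArrayList<>(wordToFrequencyMap.values());
        // Reverse the natural order so the most frequent words come first.
        sorted.sort(Collections.reverseOrder());
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical sample input; "the" is dropped by the length filter.
        List<String> lines = List.of("Alpha beta alpha", "Gamma beta ALPHA the");
        for (WordStatistics ws : countAndSort(lines)) {
            System.out.println(ws.word + " " + ws.freq);
        }
    }
}
```

With the sample input this prints alpha 3, beta 2, gamma 1, one per line. Note that updating a HashMap from forEach is only safe on a sequential stream, because HashMap is not thread-safe.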