I'm new to Java and hope someone can help me solve this problem.
I have a data set that contains a collection of words, and it keeps growing. I don't want duplicate words, so I'm using this code to check whether each word has already been added; if not, it is added to both lists.
for (int i = 0; i < rawWords.size(); i++) {
    String word = rawWords.get(i);
    if (!words.contains(word)) {
        words.add(word);
        wordsToExport.add(word);
    }
}
The problem is that as the number of words increases, my program starts to slow down. Is there a solution for this, or is there an error in my code?
CodePudding user response:
If your Collection should not contain duplicates, then you should use a Set.
Set<String> wordSet = new HashSet<>(words);
wordSet.addAll(rawWords);
The typical option is a HashSet. This assumes that your data objects implement hashCode and that their hashCode and equals methods are consistent. Since you are working with String, you do not have to do anything, because this class already meets those requirements.
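For a custom type, the consistency requirement could look roughly like the sketch below. The Token class, its fields, and the TokenDemo helper are hypothetical names made up for illustration; the point is only that objects that compare equal must also return the same hash code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class, used only to illustrate the equals/hashCode contract.
final class Token {
    private final String text;
    private final String language;

    Token(String text, String language) {
        this.text = text;
        this.language = language;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Token)) return false;
        Token other = (Token) o;
        return text.equals(other.text) && language.equals(other.language);
    }

    @Override
    public int hashCode() {
        // Built from the same fields that equals compares.
        return Objects.hash(text, language);
    }
}

class TokenDemo {
    // Adds two equal tokens and one different token, then reports the set size.
    static int distinctCount() {
        Set<Token> tokens = new HashSet<>();
        tokens.add(new Token("chat", "en"));
        tokens.add(new Token("chat", "en")); // equal to the first, so not stored again
        tokens.add(new Token("chat", "fr"));
        return tokens.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // prints 2
    }
}
```

If hashCode were left at the default inherited from Object, the two equal tokens would usually land in different buckets and both would be kept.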
If you require some sort of ordering in your Collection, then consider TreeSet or LinkedHashSet, depending on your use case. Search for information on the Java Collections Framework for more details.
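As a rough illustration of the difference, the sketch below removes duplicates from the same list with each of the two ordered sets. The OrderedSets class, its method names, and the sample words are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class OrderedSets {
    // LinkedHashSet keeps elements in the order they were first inserted.
    static Set<String> insertionOrder(List<String> rawWords) {
        return new LinkedHashSet<>(rawWords);
    }

    // TreeSet keeps elements sorted (natural String ordering here).
    static Set<String> sortedOrder(List<String> rawWords) {
        return new TreeSet<>(rawWords);
    }

    public static void main(String[] args) {
        List<String> rawWords = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(insertionOrder(rawWords)); // prints [pear, apple, fig]
        System.out.println(sortedOrder(rawWords));    // prints [apple, fig, pear]
    }
}
```

LinkedHashSet keeps the constant-time lookups of HashSet, while TreeSet operations take logarithmic time, so the sorted variant is worth it only if the sorted order is actually needed.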
CodePudding user response:
You add each word to both the words list and wordsToExport, and words.contains(word) has to scan the whole list on every call, which slows things down as the list grows. If you want to avoid duplicate words, you can simply use a Set implementation. The example below prints [word1, word3, word2] to the console (a HashSet does not guarantee any particular iteration order):
public static void main(String[] args) {
    List<String> rawWords = Arrays.asList("word1", "word2", "word3", "word1", "word2");
    Set<String> words = new HashSet<>();
    Set<String> wordsToExport = new HashSet<>();
    for (String word : rawWords) {
        words.add(word);
        wordsToExport.add(word);
    }
    System.out.println(words);
}
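If wordsToExport should receive only the words that were not already known, as in the loop from the question, one possible variant relies on the boolean returned by Set.add, which is true only when the element was not yet in the set. This is a minimal sketch; the NewWordFilter class and the collectNewWords method are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NewWordFilter {
    // Returns the words from rawWords that were not yet in `known`,
    // and records them in `known` as a side effect.
    static List<String> collectNewWords(Set<String> known, List<String> rawWords) {
        List<String> wordsToExport = new ArrayList<>();
        for (String word : rawWords) {
            // add() returns false for duplicates, so no separate contains() call is needed.
            if (known.add(word)) {
                wordsToExport.add(word);
            }
        }
        return wordsToExport;
    }

    public static void main(String[] args) {
        Set<String> words = new HashSet<>(Arrays.asList("word1"));
        List<String> rawWords = Arrays.asList("word1", "word2", "word3", "word2");
        System.out.println(collectNewWords(words, rawWords)); // prints [word2, word3]
    }
}
```

Each add on a HashSet takes roughly constant time on average, so the loop stays linear in the number of raw words instead of growing quadratically as it does with List.contains.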