I have a question about word similarity between two sentences. Example: "Jack go to basketball" and "Jack go to basketball match". I want code that computes a similarity ratio by dividing the number of words the two sentences have in common by the number of words in the longer sentence. Which library should I use for this? Thank you.
There are many kinds of similarity measures. I would also like to know what this kind of similarity is called.
CodePudding user response:
You can use a document similarity technique such as cosine similarity.
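For reference, cosine similarity over plain word counts can be sketched as below. This is only a minimal illustration, not a library API: the class name `CosineSketch` and its helper methods are hypothetical, words are split on whitespace, and the comparison is case-sensitive with no punctuation handling.

```java
import java.util.HashMap;
import java.util.Map;

class CosineSketch {
    // Count how often each whitespace-separated word occurs (assumption: no stemming or lowercasing).
    static Map<String, Integer> wordCounts(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine of the angle between the two word-count vectors.
    static double cosineSimilarity(String a, String b) {
        Map<String, Integer> countsA = wordCounts(a);
        Map<String, Integer> countsB = wordCounts(b);

        // Dot product: only words present in both sentences contribute.
        double dot = 0;
        for (Map.Entry<String, Integer> e : countsA.entrySet()) {
            dot += e.getValue() * countsB.getOrDefault(e.getKey(), 0);
        }

        // Squared Euclidean norms of both count vectors.
        double normA = 0;
        double normB = 0;
        for (int c : countsA.values()) normA += c * c;
        for (int c : countsB.values()) normB += c * c;

        // An empty sentence has no direction; return 0 by convention here.
        if (normA == 0 || normB == 0) return 0.0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        System.out.println(cosineSimilarity("Jack go to basketball", "Jack go to basketball match"));
    }
}
```

For the two example sentences this gives 4 / (2 * sqrt(5)), roughly 0.894, which is not the same number as the ratio you described, so it only fits if you do not need exactly "shared words divided by the longer sentence".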
Here I have implemented a solution based on your description.
import java.util.HashMap;
import java.util.Map;

double findSimilarityRatio (String sentence1, String sentence2) {
    // Word -> number of occurrences, one map per sentence.
    HashMap<String, Integer> firstSentenceMap = new HashMap<>();
    HashMap<String, Integer> secondSentenceMap = new HashMap<>();
    String[] firstSentenceWords = sentence1.split(" ");
    String[] secondSentenceWords = sentence2.split(" ");
    for (String word : firstSentenceWords) {
        if (firstSentenceMap.containsKey(word)) {
            firstSentenceMap.put(word, firstSentenceMap.get(word) + 1);
        }
        else {
            firstSentenceMap.put(word, 1);
        }
    }
    for (String word : secondSentenceWords) {
        if (secondSentenceMap.containsKey(word)) {
            secondSentenceMap.put(word, secondSentenceMap.get(word) + 1);
        }
        else {
            secondSentenceMap.put(word, 1);
        }
    }
    double totalWords = 0;
    double totalHits = 0;
    // Divide by the word count of the longer sentence; a shared word counts
    // as many times as it appears in both sentences (the smaller of the two counts).
    if (firstSentenceWords.length >= secondSentenceWords.length) {
        totalWords = firstSentenceWords.length;
        for (Map.Entry<String, Integer> entry : firstSentenceMap.entrySet()) {
            String key = entry.getKey();
            if (secondSentenceMap.containsKey(key)) {
                totalHits = totalHits + Math.min(secondSentenceMap.get(key), firstSentenceMap.get(key));
            }
        }
    }
    else {
        totalWords = secondSentenceWords.length;
        for (Map.Entry<String, Integer> entry : secondSentenceMap.entrySet()) {
            String key = entry.getKey();
            if (firstSentenceMap.containsKey(key)) {
                totalHits = totalHits + Math.min(secondSentenceMap.get(key), firstSentenceMap.get(key));
            }
        }
    }
    return totalHits/totalWords;
}
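To check the idea against the sentences from the question, here is a condensed, self-contained sketch of the same ratio. The class name `WordOverlapSketch` and its method names are hypothetical, and it differs slightly from the method above: it splits on any run of whitespace and returns 0 when both sentences are empty. Like the method above, it is case-sensitive and treats punctuation as part of a word.

```java
import java.util.HashMap;
import java.util.Map;

class WordOverlapSketch {
    // Count how often each whitespace-separated word occurs.
    static Map<String, Integer> wordCounts(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Total number of words represented by a count map.
    static int totalWords(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) total += c;
        return total;
    }

    // Shared words (counting repeats) divided by the word count of the longer sentence.
    static double similarityRatio(String a, String b) {
        Map<String, Integer> countsA = wordCounts(a);
        Map<String, Integer> countsB = wordCounts(b);

        int longer = Math.max(totalWords(countsA), totalWords(countsB));
        if (longer == 0) return 0.0; // assumption: two empty sentences score 0

        int hits = 0;
        for (Map.Entry<String, Integer> e : countsA.entrySet()) {
            // A word shared k times in both sentences contributes min(countA, countB).
            hits += Math.min(e.getValue(), countsB.getOrDefault(e.getKey(), 0));
        }
        return (double) hits / longer;
    }

    public static void main(String[] args) {
        // 4 shared words / 5 words in the longer sentence
        System.out.println(similarityRatio("Jack go to basketball", "Jack go to basketball match")); // prints 0.8
    }
}
```

No external library is needed for this; `java.util.HashMap` is enough.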
Hope it helps, cheers!