I have a java TreeMap<String, Integer> (let's call it multiset for now, because that is what I am using it for), and I want it to be sorted by the string's length. What I mean is that for some code like this:
// Imports
import java.util.Comparator;
import java.util.TreeMap;
class CompareLength implements Comparator<String> {
public int compare(String string, String string_2) {
return Integer.compare(string.length(), string_2.length());
}
}
// More lines of code ...
TreeMap<String, Integer> multiset = new TreeMap<>(new CompareLength());
multiset.put("abc", 1);
multiset.put("x", 1);
multiset.put("yz", 1);
for (String string: multiset.keySet()) {
System.out.print(string " ");
}
System.out.println(multiset.containsKey("xyz") " ");
The output is: x yz abc true
I wanted it to be sorted by length, not completely change the comparator of the TreeMap! How do I do this? I also want this to work no matter what the comparator is, the key type, the keys, the value, the value type, etc.
CodePudding user response:
As you realized, the Comparator
passed to a TreeMap
doesn't just specify the order of the entries, but also which entries are considered equivalent (i.e. those for which Comparator.compare()
returns 0).
The conclusion is that you simply shouldn't return 0 for any two string values, unless they are indeed identical. A simple way is to use a Comparator like this:
class CompareLength implements Comparator<String> {
public int compare(String str1, String str2) {
int result = Integer.compare(str1.legnth(), str2.length());
if (result == 0) {
result = str1.compareTo(str2);
}
return result;
}
}
CodePudding user response:
The Answer by Sauer is correct.
Here is a neater version of that code.
The Comparator
class has handy methods for creating a Comparator
implementation using a lambda or a method reference. Here we use both.
We do not use the String#length
method. That method is based on the char
type. The char
type has been essentially broken since Java 2, and legacy since Java 5. As a 16-bit value, char
is physically incapable of representing most characters. As a result, String#length
fails if the string contains any of the characters beyond the BMP in Unicode, that is, fails with any of the majority of characters. Instead, use code point integer numbers when working with individual characters in Java.
We declare our map to be the more general interface NavigableMap
rather than the concrete class TreeMap
.
NavigableMap < String, Integer > map =
new TreeMap<>(
Comparator
.comparing( ( String s ) -> s.codePoints().count() )
.thenComparing( String :: compareTo )
);
Define some sample data. We add a couple more entries than you had to show multiple keys with the same string length.
The Map.of
methods are a short sweet way to define an unmodifiable map in a literals fashion. We pass that map as an argument.
map.putAll(
Map.of (
"abc", 1 ,
"x", 2 ,
"b" , 3 ,
"a" , 4 ,
"yz", 5
)
) ;
Dump to console.
System.out.println( map ) ;
See this code run live at Ideone.com.
{a=4, b=3, x=2, yz=5, abc=1}