Can't compare strings by alphabetical order

Time:10-21

My DB stores keys by alphabetical order:

-MmNI8oyb2QE_9V0WdaX \\1st (oldest)
-MmOAFDL9ZPD1gx4SjEU \\2nd
-MmPtIJ1LpFTRbweNWvD \\3rd
-MmPtd0IMuNIEYaPYPgZ \\4th (newest)

The sort order of this list is correct: it is alphabetical. The 1st string is the smallest, followed by the 2nd, 3rd, and 4th.

I would like to compare any two strings from that list and get results that match the order of the list. For example, if I compare the 4th string with the 3rd one, the result should say that the 4th is bigger than the 3rd.

What I tried so far: string.Compare. However, it doesn't work for my list; the results I received weren't consistent:

string first = "-MmNI8oyb2QE_9V0WdaX";
string second = "-MmOAFDL9ZPD1gx4SjEU";
string third = "-MmPtIJ1LpFTRbweNWvD";
string fourth = "-MmPtd0IMuNIEYaPYPgZ";
string.Compare(third, fourth);  // output: 1
string.Compare(second, third);  // output: -1
string.Compare(first, second);  // output: -1

(The outputs should all be "1" or all be "-1", because my list is sorted.) What function should I use to compare the strings instead? I also tried string.Compare(third, fourth, false), the overload that takes an ignoreCase flag, but it didn't help. My guess is that it has something to do with case.

CodePudding user response:

It seems you actually want to sort these strings by their ordinal value (their binary representation). In that case use StringComparison.Ordinal as comparisonType.

 string.Compare(third, fourth, StringComparison.Ordinal); //output: -27
 string.Compare(second, third, StringComparison.Ordinal); //output: -1
 string.Compare(first, second, StringComparison.Ordinal); //output: -1

Weirdly, the first comparison yields -27 instead of -1. The Compare method only specifies that the return value will be less than 0, equal to 0, or greater than 0, so only the sign matters: those three outputs are essentially (and semantically) the same result.
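If you do need a consistent -1, 0, or 1, one option is to pass the result through Math.Sign. This is only a minimal sketch using the keys from the question; the class and method names (KeyOrder, OrdinalSign) are made up for illustration.

```csharp
using System;

static class KeyOrder
{
    // Hypothetical helper: collapses the ordinal comparison result to -1, 0, or 1.
    public static int OrdinalSign(string a, string b) =>
        Math.Sign(string.Compare(a, b, StringComparison.Ordinal));
}

class Program
{
    static void Main()
    {
        string third = "-MmPtIJ1LpFTRbweNWvD";
        string fourth = "-MmPtd0IMuNIEYaPYPgZ";

        Console.WriteLine(KeyOrder.OrdinalSign(third, fourth)); // -1 (third sorts before fourth)
        Console.WriteLine(KeyOrder.OrdinalSign(fourth, third)); // 1
        Console.WriteLine(KeyOrder.OrdinalSign(third, third));  // 0
    }
}
```

In most code it is enough to test the raw result with < 0 or > 0, so the helper is only worth having if the exact values -1 and 1 are stored or displayed somewhere.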

CodePudding user response:

This is an interesting phenomenon. When no StringComparison is passed, string.Compare performs a culture-sensitive comparison using word sort rules, which weight characters so that similar words stay together (see the Remarks section of the string.Compare documentation).

It produces some unexpected results, even when ignoreCase is false (that is, when the comparison is case-sensitive):

string.Compare("A", "c", false); // -1
string.Compare("a", "c", false); // -1
string.Compare("E", "c", false); // 1
string.Compare("e", "c", false); // 1

For strict comparison that uses the character code ordering, you want to use an Ordinal comparison method.

// UPPERCASE letters come before lowercase.
string.Compare("A", "c", StringComparison.Ordinal); // -34
string.Compare("a", "c", StringComparison.Ordinal); // -2
string.Compare("E", "c", StringComparison.Ordinal); // -30
string.Compare("e", "c", StringComparison.Ordinal); // 2
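The same ordinal rule can be applied when sorting, not just when comparing pairs. As a rough sketch, assuming the keys are held in a List<string> (the KeySorter class and SortOrdinal method are hypothetical names for this example), StringComparer.Ordinal reproduces the order shown in the question:

```csharp
using System;
using System.Collections.Generic;

static class KeySorter
{
    // Hypothetical helper: returns a new list sorted by character code (ordinal) order.
    public static List<string> SortOrdinal(IEnumerable<string> keys)
    {
        var sorted = new List<string>(keys);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }
}

class Program
{
    static void Main()
    {
        // The question's keys, deliberately shuffled.
        var keys = new List<string>
        {
            "-MmPtd0IMuNIEYaPYPgZ", // 4th
            "-MmNI8oyb2QE_9V0WdaX", // 1st
            "-MmPtIJ1LpFTRbweNWvD", // 3rd
            "-MmOAFDL9ZPD1gx4SjEU", // 2nd
        };

        // Prints the keys in the database's order: 1st, 2nd, 3rd, 4th.
        foreach (string key in KeySorter.SortOrdinal(keys))
        {
            Console.WriteLine(key);
        }
    }
}
```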