String sort using CharacterSet order

Time:06-15

I am trying to alphabetically sort an array of non-English strings which contain a number of special Unicode characters. I can create a CharacterSet sequence which contains the desired lexicographic sort order.

Is there a way in Swift 5 to perform this kind of customized sort? I believe I saw such a function some years back, but a fairly exhaustive search today turned up nothing.

Any pointers would be appreciated!

CodePudding user response:

As a simple implementation of matt's cosorting comment:

// You have `t` twice in your string; I've removed the first one.
let alphabet = "ꜢjiyꜤwbpfmnRrlhḥḫẖzsšqkgtṯdḏ "

// Map characters to their location in the string as integers
let order = Dictionary(uniqueKeysWithValues: zip(alphabet, 0...))

// Make the alphabet backwards as a test string
let string = alphabet.reversed()

// This sorts unknown characters at the end. Or you could throw instead.
let sorted = string.sorted { (order[$0] ?? .max) < (order[$1] ?? .max) }

print(sorted)
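
The snippet above sorts the characters of a single string, while the question asks about an array of strings. A minimal sketch of one way to extend the same idea: map each string to its sequence of alphabet positions and compare those sequences lexicographically. The names `customAlphabet`, `rank`, `sortKey`, and `sortedByCustomAlphabet` are hypothetical, and the sample words are made up purely for illustration.

```swift
// Custom alphabet copied from the answer above (assumed to contain no duplicates).
let customAlphabet = "ꜢjiyꜤwbpfmnRrlhḥḫẖzsšqkgtṯdḏ "

// Rank of each character is its position in the alphabet.
let rank = Dictionary(uniqueKeysWithValues: zip(customAlphabet, 0...))

// Convert a string to its rank sequence; unknown characters get Int.max so they sort last.
func sortKey(_ s: String) -> [Int] {
    return s.map { rank[$0] ?? Int.max }
}

// Compute each key once, then compare keys element by element.
// A string that is a prefix of another sorts before the longer one.
func sortedByCustomAlphabet(_ words: [String]) -> [String] {
    return words
        .map { (word: $0, key: sortKey($0)) }
        .sorted { $0.key.lexicographicallyPrecedes($1.key) }
        .map { $0.word }
}

// Made-up sample words built from the alphabet's characters.
let words = ["db", "jw", "Ꜣb", "j", "ḏd", "dꜢ"]
print(sortedByCustomAlphabet(words)) // ["Ꜣb", "j", "jw", "dꜢ", "db", "ḏd"]
```

Precomputing the keys avoids rebuilding them on every comparison, which may matter for long arrays. If unknown characters should be treated as an error instead, `sortKey` could throw rather than fall back to `Int.max`.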

CodePudding user response:

Rather than building your own “non-English” sorting, you might consider localized comparison. E.g.:

import Foundation

let strings = ["a", "á", "ä", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t"]

let result1 = strings.sorted()
print(result1) // ["a", "b", "c", "d", "e", "f", "r", "s", "t", "ß", "á", "ä", "é"]

let result2 = strings.sorted {
    $0.localizedCaseInsensitiveCompare($1) == .orderedAscending 
}
print(result2) // ["a", "á", "ä", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t"]

let locale = Locale(identifier: "sv")
let result3 = strings.sorted {
    $0.compare($1, options: .caseInsensitive, locale: locale) == .orderedAscending
}

print(result3) // ["a", "á", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t", "ä"]

And a non-Latin example:

// Named `kana` so this snippet can share a file with the Latin example above.
let kana = ["あ", "か", "さ", "た", "い", "き", "し", "ち", "う", "く", "す", "つ", "ア", "カ", "サ", "タ", "イ", "キ", "シ", "チ", "ウ", "ク", "ス", "ツ", "が", "ぎ"]

let result4 = kana.sorted {
    $0.localizedCaseInsensitiveCompare($1) == .orderedAscending 
}
print(result4) // ["あ", "ア", "い", "イ", "う", "ウ", "か", "カ", "が", "き", "キ", "ぎ", "く", "ク", "さ", "サ", "し", "シ", "す", "ス", "た", "タ", "ち", "チ", "つ", "ツ"]
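
If the same locale-specific comparison is needed in several places, it could be wrapped in a small helper. This is only a sketch: `sortedStrings(_:locale:)` is a hypothetical name, and the resulting order depends on the collation data available on the platform, so no expected output is shown.

```swift
import Foundation

// Hypothetical helper: sort strings using the collation rules of the given locale,
// ignoring case, via Foundation's compare(_:options:range:locale:).
func sortedStrings(_ strings: [String], locale: Locale) -> [String] {
    return strings.sorted {
        $0.compare($1, options: .caseInsensitive, range: nil, locale: locale) == .orderedAscending
    }
}

let sample = ["b", "ä", "a", "z"]

// The position of "ä" may differ between locales.
print(sortedStrings(sample, locale: Locale(identifier: "sv")))
print(sortedStrings(sample, locale: Locale(identifier: "de")))
```

Passing the locale explicitly, rather than relying on the user's current locale, keeps the order reproducible across devices.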