I have a set of strings I need to sort in an order which is not Latin alphabetic. Specifically, I have a string "AiyawbpfmnrhHxXzsSqkgtTdD" which specifies the sorting order, i.e., "y" comes before "a", but after "A". In case you are interested, this is the sort order for ancient Egyptian hieroglyphs as specified in the Manuel de Codage.
In Swift, is there a convenient way to specify a predicate or other approach for this type of collation order?
CodePudding user response:
First, turn your alphabet into a Dictionary that maps each Character to its integer position in the alphabet:
import Foundation
let hieroglyphAlphabet = "AiyawbpfmnrhHxXzsSqkgtTdD"
let hieroglyphCodes = Dictionary(
    uniqueKeysWithValues: hieroglyphAlphabet
        .enumerated()
        .map { (key: $0.element, value: $0.offset) }
)
Next, extend StringProtocol with a property that returns an array of such alphabetic positions:
extension StringProtocol {
    var hieroglyphEncoding: [Int] { map { hieroglyphCodes[$0] ?? -1 } }
}
I'm turning non-alphabetic characters into -1, so they sort before every character in the alphabet. You could turn them into Int.max instead to make them sort after every alphabetic character, or use a more complex type than Int if you need more special treatment.
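As a rough illustration of the Int.max variant, here is a minimal, self-contained sketch. The names alphabet, codes and encodingUnknownLast are made up for this example and are not part of the code in this answer; only the fallback value differs from the -1 version.

```swift
// Hypothetical names for illustration; the alphabet string is the one from the question.
let alphabet = "AiyawbpfmnrhHxXzsSqkgtTdD"
let codes = Dictionary(
    uniqueKeysWithValues: alphabet.enumerated().map { ($0.element, $0.offset) }
)

// Characters outside the alphabet map to Int.max, so they compare
// greater than every character that is in the alphabet.
func encodingUnknownLast(_ s: String) -> [Int] {
    s.map { codes[$0] ?? Int.max }
}

let sortedUnknownLast = ["order", "is", "sort"].sorted {
    encodingUnknownLast($0).lexicographicallyPrecedes(encodingUnknownLast($1))
}
print(sortedUnknownLast)  // ["is", "sort", "order"]
```

Here "order" moves to the end because its first character, "o", is not in the alphabet and now compares greater than everything else.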
Now you can sort an array of strings by hieroglyphEncoding, using the lexicographicallyPrecedes method of Sequence:
let unsorted = "this is the sort order".components(separatedBy: " ")
let sorted = unsorted.sorted {
    $0.hieroglyphEncoding.lexicographicallyPrecedes($1.hieroglyphEncoding)
}
print(sorted)
Output:
["order", "is", "sort", "the", "this"]
Recomputing the hieroglyphEncoding of each string for every comparison during the sort is inefficient, so if you have many strings to sort, you should pair each string with its precomputed encoding in a wrapper for sorting, or use a Schwartzian transform (decorate, sort, undecorate).
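One possible shape for the Schwartzian transform is sketched below. It is self-contained and uses its own hypothetical names (order, positions, sortedByHieroglyphOrder) rather than the extension above, and it keeps the same -1 fallback for characters outside the alphabet. Each key is computed once per string instead of once per comparison.

```swift
// Hypothetical names for illustration; the alphabet string is the one from the question.
let order = "AiyawbpfmnrhHxXzsSqkgtTdD"
let positions = Dictionary(
    uniqueKeysWithValues: order.enumerated().map { ($0.element, $0.offset) }
)

func sortedByHieroglyphOrder(_ strings: [String]) -> [String] {
    // Decorate: pair every string with its sort key, computed exactly once.
    let decorated: [(string: String, key: [Int])] = strings.map { string in
        (string: string, key: string.map { positions[$0] ?? -1 })
    }
    // Sort: compare only the precomputed keys.
    let sortedPairs = decorated.sorted { $0.key.lexicographicallyPrecedes($1.key) }
    // Undecorate: drop the keys and keep the strings.
    return sortedPairs.map { $0.string }
}

let demoSorted = sortedByHieroglyphOrder(["this", "is", "the", "sort", "order"])
print(demoSorted)  // ["order", "is", "sort", "the", "this"]
```

The trade-off is extra memory for the keys in exchange for far fewer dictionary lookups when the input is large.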