I have this code to replace part of a string and remove white spaces:
let str = "باب ".replacingOccurrences(of: "باب", with: "").trimmingCharacters(in: .whitespacesAndNewlines)
print(str.count) /// gives 1 why not 0
But it always prints 1 when I expect 0. Why?
CodePudding user response:
If the leftover character is a right-to-left mark (and it probably is), that's "\u{200F}" in Swift. If you want to trim it along with the whitespace, just add it to your character set. That'd be something like:
.trimmingCharacters(in: .whitespacesAndNewlines
    .union(CharacterSet(charactersIn: "\u{200f}")))
You can also just replace that directly:
.replacingOccurrences(of: "\u{200f}باب", with: "")
Keep in mind the layout rules, since sometimes bidirectional literal strings like this can get confusing in the editor. You may want to separate the Arabic like:
let bab = "باب"
let rtl = "\u{200f}"
string.replacingOccurrences(of: rtl + bab, with: "")
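Putting the trimming approach together, here is a minimal sketch. It assumes the mark sits after the trailing space; the input is built from escapes so the invisible character is explicit, and the names `input`, `trimSet`, and `result` are only illustrative.

```swift
import Foundation

// Spell the pieces with escapes so nothing invisible hides in the source.
let bab = "\u{628}\u{627}\u{628}"   // the Arabic word from the question
let rtl = "\u{200F}"                // RIGHT-TO-LEFT MARK

// Assumption: the mark follows the trailing space in the original string.
let input = bab + " " + rtl

// Whitespace and newlines, plus the right-to-left mark, in one trimming set.
let trimSet = CharacterSet.whitespacesAndNewlines
    .union(CharacterSet(charactersIn: rtl))

let result = input
    .replacingOccurrences(of: bab, with: "")
    .trimmingCharacters(in: trimSet)

print(result.count) // 0
```

Adding only the one scalar keeps the trimming set narrow, so other invisible characters would still survive if they ever show up.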
CodePudding user response:
Let's look at the content of your original string:
func hexCharactersArray(_ string: String) -> String {
string.unicodeScalars.map { String(format: "0x%X", $0.value)}.joined(separator: ",")
}
let originalString = "باب "
print(hexCharactersArray(originalString))
The result is 0x628,0x627,0x628,0x20,0x200F
0x628 - arabic letter beh
0x627 - arabic letter alef
0x628 - arabic letter beh
0x20 - space
0x200F - right-to-left mark
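The first three are letters, followed by a space, but 0x200F is neither a letter nor whitespace. It is an invisible formatting character (Unicode general category Cf), which Foundation's CharacterSet.controlCharacters covers together with the true control characters (Cc).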
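To check how the mark is classified, you can query the scalar directly. This is a small sketch using the standard library's `Unicode.Scalar.Properties` and Foundation's `CharacterSet`; the constant name `mark` is only illustrative.

```swift
import Foundation

let mark: Unicode.Scalar = "\u{200F}"   // RIGHT-TO-LEFT MARK

// The Unicode general category is "format" (Cf), and the scalar
// does not carry the White_Space property.
print(mark.properties.generalCategory == .format)          // true
print(mark.properties.isWhitespace)                        // false

// Foundation's sets: not whitespace, but covered by controlCharacters,
// which includes both Cc and Cf.
print(CharacterSet.whitespacesAndNewlines.contains(mark))  // false
print(CharacterSet.controlCharacters.contains(mark))       // true
```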
When you do:
let replacedString = originalString.replacingOccurrences(of: "باب", with: "").trimmingCharacters(in: .whitespacesAndNewlines)
print(hexCharactersArray(replacedString))
you get 0x200F, because you've replaced the letters and trimmed the whitespace, but left the right-to-left mark behind.
If you want to trim that out too, use:
let replacedString = originalString.replacingOccurrences(of: "باب", with: "").trimmingCharacters(in: .whitespacesAndNewlines.union(.controlCharacters))
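As a quick end-to-end check, the sketch below runs both variants side by side. It assumes the same scalar sequence as above, written with escapes; `hexScalars`, `whitespaceOnly`, and `cleaned` are illustrative names, not part of the original code.

```swift
import Foundation

// Same idea as hexCharactersArray: dump each Unicode scalar as hex.
func hexScalars(_ string: String) -> String {
    string.unicodeScalars
        .map { String(format: "0x%X", $0.value) }
        .joined(separator: ",")
}

let word = "\u{628}\u{627}\u{628}"
let original = word + " " + "\u{200F}"   // word, space, right-to-left mark

// Trimming whitespace alone leaves the mark behind.
let whitespaceOnly = original
    .replacingOccurrences(of: word, with: "")
    .trimmingCharacters(in: .whitespacesAndNewlines)

// Trimming whitespace and control/format characters removes it too.
let cleaned = original
    .replacingOccurrences(of: word, with: "")
    .trimmingCharacters(in: CharacterSet.whitespacesAndNewlines.union(.controlCharacters))

print(hexScalars(whitespaceOnly)) // 0x200F
print(cleaned.count)              // 0
```

Note that trimming only touches the two ends of the string; a mark in the middle of the text would stay where it is.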