In my Go project, I am dealing with Asian languages, so my strings contain double-byte characters. In my case, I have a string that contains two words with a space between them.
EG: こんにちは 世界
Now I need to check whether that space is a double-byte space and, if so, convert it into a single-byte space.
I have searched a lot, but I couldn't find a way to do this, so I'm sorry that I have no code sample to add here.
Do I need to loop through each character, pick out the double-byte space by its code, and replace it? What code should I use to identify the double-byte space?
CodePudding user response:
Just replace?
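package main

import (
	"fmt"
	"strings"
)

func main() {
	// "\u3000" is the full-width (ideographic) space; replace every occurrence with an ASCII space.
	fmt.Println(strings.Replace("こんにちは\u3000世界", "\u3000", " ", -1))
}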
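Notice that the second argument to Replace is the full-width (ideographic) space, U+3000, not an ASCII space. You can copy-paste it from the string in your example or write it as "\u3000". Replace finds every occurrence of it in the original string and replaces it with an ASCII space (U+0020); the final argument -1 means there is no limit on the number of replacements.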
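The question also asks how to check for the full-width space before converting it. A minimal sketch of that check, using only strings.Contains and strings.ReplaceAll from the standard library, could look like this. The names ideographicSpace and hasIdeographicSpace are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// ideographicSpace is U+3000, the full-width space used in CJK text.
// The constant name is illustrative, not part of any library.
const ideographicSpace = "\u3000"

// hasIdeographicSpace reports whether s contains at least one U+3000.
// Hypothetical helper, named for this example only.
func hasIdeographicSpace(s string) bool {
	return strings.Contains(s, ideographicSpace)
}

func main() {
	s := "こんにちは\u3000世界"
	if hasIdeographicSpace(s) {
		// Replace every full-width space with an ASCII space.
		s = strings.ReplaceAll(s, ideographicSpace, " ")
	}
	fmt.Println(s)
}
```

The check is optional in practice: replacing on a string that has no full-width space simply returns it unchanged.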
CodePudding user response:
In Go there is no such thing as a double-byte character. There is a special type, rune, which is an alias for int32 under the hood and holds a single Unicode code point.
Your special space is code point 12288 (U+3000, the ideographic space) and the normal space is 32 (U+0020).
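To make the difference between code points and bytes concrete, here is a small sketch that prints both for the two spaces. It relies on utf8.RuneLen from the standard library; the helper name describe is invented for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe returns the code point value of r and the number of bytes
// it occupies when encoded as UTF-8. Hypothetical helper for illustration.
func describe(r rune) (codePoint int, byteLen int) {
	return int(r), utf8.RuneLen(r)
}

func main() {
	for _, r := range []rune{' ', '\u3000'} {
		cp, n := describe(r)
		fmt.Printf("U+%04X = %d, %d byte(s) in UTF-8\n", r, cp, n)
	}
	// Prints:
	// U+0020 = 32, 1 byte(s) in UTF-8
	// U+3000 = 12288, 3 byte(s) in UTF-8
}
```

So in a UTF-8 Go string the "double-byte" space actually takes three bytes, which is why comparing runes is more reliable than counting bytes.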
To iterate over the characters of a string, use range, which decodes one rune per iteration:
for _, char := range chars { ... } // char has type rune
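As a rough illustration of the range loop, the following sketch collects the positions of every full-width space in a string. The function name findIdeographicSpaces is an assumption made for this example.

```go
package main

import "fmt"

// findIdeographicSpaces returns the byte offsets at which U+3000 occurs in s.
// Hypothetical helper for illustration.
func findIdeographicSpaces(s string) []int {
	var offsets []int
	for i, r := range s { // i is a byte offset, r is the decoded rune
		if r == '\u3000' {
			offsets = append(offsets, i)
		}
	}
	return offsets
}

func main() {
	// Each of the five hiragana characters takes 3 bytes, so the space starts at byte 15.
	fmt.Println(findIdeographicSpaces("こんにちは\u3000世界")) // prints [15]
}
```

Note that the index produced by range is a byte offset, not a character count.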
To replace this character you can use strings.Replace, or use strings.Map with a function that maps each unwanted character to its replacement:
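// converter maps the ideographic space (12288, U+3000) to an ASCII space (32)
// and returns every other rune unchanged.
func converter(r rune) rune {
	if r == 12288 {
		return 32
	}
	return r
}

result := strings.Map(converter, "こんにちは\u3000世界")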
It is also possible to use character literals instead of numbers:
	if r == '\u3000' { // ideographic (full-width) space
		return ' '
	}
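Putting the pieces of this answer together, a complete program might look like the sketch below. It uses strings.Map with rune literals as described above; the names toASCIISpace and normalizeSpaces are illustrative and not from the original answer.

```go
package main

import (
	"fmt"
	"strings"
)

// toASCIISpace maps the ideographic space (U+3000) to an ASCII space
// and leaves every other rune unchanged.
func toASCIISpace(r rune) rune {
	if r == '\u3000' {
		return ' '
	}
	return r
}

// normalizeSpaces applies toASCIISpace to every rune in s.
// Hypothetical helper name, chosen for this example.
func normalizeSpaces(s string) string {
	return strings.Map(toASCIISpace, s)
}

func main() {
	// %q quotes the result so the space is easy to see.
	fmt.Printf("%q\n", normalizeSpaces("こんにちは\u3000世界"))
}
```

strings.Map is a reasonable choice when more characters may need mapping later, since extra cases can be added to the mapping function without changing the call site.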