I have an API that returns results in a specific single-byte charset (Windows-1257), and I am reading the result in Kotlin like this:
val connection = URL("http://192.168.1.21:92/someAPI").openConnection() as HttpURLConnection
var byteArray: ByteArray = ByteArray(10000000)
connection.inputStream.read(byteArray)
val tmp = String(byteArray, Charsets.UTF_8).trim()
Of course, this code is incorrect, because it assumes that byteArray holds a UTF-8 encoded string. The obvious fix would be to use Charsets.WIN_1257, but there is no such constant in Kotlin. My byte array is the representation of a Windows-1257 encoded string. How can I get a correctly decoded string from it?
Here is a simple test that isolates my problem and can be run at https://play.kotlinlang.org:
/**
* You can edit, run, and share this code.
* play.kotlinlang.org
*/
fun main() {
var byteArray: ByteArray = listOf(0xe2, 0x72).map { it.toByte() }.toByteArray()
println(String(byteArray, Charsets.UTF_8))
}
One can see that UTF_8 produces the result:
�r
But I expect:
ār
CodePudding user response:
Look into Charset.availableCharsets; just Charset.forName("Windows-1257") might work.
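To make that concrete, here is a minimal sketch of the lookup-by-name approach. It assumes the JVM in use ships the windows-1257 charset (full JDKs normally do, but it is not one of the charsets every Java platform is required to support). The names WIN_1257 and readText are made up for this example and are not part of Kotlin or the JDK.

```kotlin
import java.io.InputStream
import java.nio.charset.Charset

// kotlin.text.Charsets only has constants for the standard charsets,
// so this one is looked up by name. The lookup throws
// UnsupportedCharsetException if the JVM does not provide it.
val WIN_1257: Charset = Charset.forName("windows-1257")

// Hypothetical helper: reads the stream to its end (instead of a single
// read() into a fixed-size buffer), closes it, and decodes the bytes.
fun readText(input: InputStream, charset: Charset = WIN_1257): String =
    input.use { it.readBytes().toString(charset) }

fun main() {
    val bytes = byteArrayOf(0xe2.toByte(), 0x72)

    // Decode the bytes directly; 0xE2 maps to U+0101 in windows-1257.
    println(String(bytes, WIN_1257))          // decodes to "ār"

    // Same result when the bytes arrive through a stream.
    println(readText(bytes.inputStream()))    // decodes to "ār"
}
```

For the HTTP case in the question, something like readText(connection.inputStream) should be able to replace the fixed 10 MB buffer and the trailing trim(), since a single read() call is not guaranteed to return the whole response. The resulting String is ordinary Unicode text; UTF-8 only comes into play again when it is written out, for example with toByteArray(Charsets.UTF_8).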