My Kotlin code converts Latin alphabet characters to binary code but non-Latin alphabet characters(A

Time:10-30

Hello everyone, my code converts Latin alphabet characters to binary, but it crashes when I try to convert non-Latin alphabet characters. Can you help me make it work for every alphabet?

fun strToBinary(str: String): String {
    val builder = StringBuilder()

    for (c in str.toCharArray()) {
        val toString = c.code.toString(2) // get char value in binary
        builder.append(String.format("%08d", Integer.parseInt(toString))) // pad with zeros to 8 digits
    }

    return builder.toString()
}

When I try non-Latin characters, it throws this exception:

Exception in thread "main" java.lang.NumberFormatException: For input string: "11000100011"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:583)
    at java.lang.Integer.parseInt(Integer.java:615)
    at BinaryKt.strToBinary(Binary.kt:6)
    at BinaryKt.main(Binary.kt:41)
    at BinaryKt.main(Binary.kt)

CodePudding user response:

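You wrongly assumed that a character always fits in 1 byte. That is only true for ASCII; in UTF-8, other characters take 2 to 4 bytes. On top of that, Integer.parseInt reads the binary digits as a decimal number, which no longer fits into an Int once there are more than 10 digits, hence the NumberFormatException.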
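To make both effects visible, here is a minimal sketch. The digit string in the exception message matches U+0623, so that character is used as the sample input; the helper name utf8Size and the other sample characters are only illustrative assumptions.

```kotlin
// Hypothetical helper: number of bytes a string occupies in UTF-8.
fun utf8Size(s: String): Int = s.toByteArray(Charsets.UTF_8).size

fun main() {
    // U+0623 produces the exact digit string from the exception message.
    val digits = '\u0623'.code.toString(2)
    println(digits)                 // 11000100011
    // Read as a decimal number it does not fit into Int, so parsing fails.
    println(digits.toIntOrNull())   // null

    println(utf8Size("A"))              // 1
    println(utf8Size("\u00E9"))         // 2 (e with acute accent)
    println(utf8Size("\u20AC"))         // 3 (euro sign)
    println(utf8Size("\uD83D\uDE00"))   // 4 (emoji, a surrogate pair in UTF-16)
}
```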

I suggest first encoding the whole string into a ByteArray using UTF-8 (the default charset of toByteArray()) and then converting it byte by byte:

fun strToBinary(str: String) = buildString {
    str.toByteArray().forEach {
        append(it.toUByte()
            .toString(2)
            .padStart(8, '0')
        )
    }
}
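If you want to check the output by eye, a variant with a separator between bytes may help. This is only a sketch: strToBinarySpaced is a made-up name, and the charset is passed explicitly just to make the UTF-8 assumption visible.

```kotlin
// Hypothetical variant of strToBinary: same per-byte conversion,
// but with a space between bytes so the boundaries are visible.
fun strToBinarySpaced(str: String): String =
    str.toByteArray(Charsets.UTF_8).joinToString(" ") { b ->
        b.toUByte().toString(2).padStart(8, '0')
    }

fun main() {
    println(strToBinarySpaced("A"))        // 01000001
    println(strToBinarySpaced("\u0623"))   // 11011000 10100011
}
```

The second call shows that the character from the question becomes two bytes, each padded to exactly 8 digits.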

Note that such an encoding takes quite a lot of space. The resulting string is at least 8 times longer than the original text. You can use hex encoding to make it 4 times shorter:

fun strToHex(str: String) = buildString {
    str.toByteArray().forEach {
        append(it.toUByte()
            .toString(16)
            .padStart(2, '0')
        )
    }
}

Or make it even shorter using base64 encoding:

import java.util.Base64

fun strToBase64(str: String) = Base64.getEncoder().encodeToString(str.toByteArray())
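To compare the three encodings side by side, here is a rough sketch that measures the output length for the same input. The function name encodedLengths and the sample string are assumptions for illustration only.

```kotlin
import java.util.Base64

// Hypothetical helper: lengths of the binary, hex and Base64 encodings
// of the UTF-8 bytes of the given string, in that order.
fun encodedLengths(str: String): Triple<Int, Int, Int> {
    val bytes = str.toByteArray(Charsets.UTF_8)
    val binary = bytes.joinToString("") { it.toUByte().toString(2).padStart(8, '0') }
    val hex = bytes.joinToString("") { it.toUByte().toString(16).padStart(2, '0') }
    val base64 = Base64.getEncoder().encodeToString(bytes)
    return Triple(binary.length, hex.length, base64.length)
}

fun main() {
    // "hello" is 5 bytes in UTF-8.
    println(encodedLengths("hello"))   // (40, 10, 8)
}
```

Binary uses 8 characters per byte and hex uses 2, while Base64 uses 4 characters for every 3 bytes (rounded up with padding).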

Update

To decode the string, we basically need to reverse all the steps. For example, to decode from binary we split the string into 8-character chunks, parse each chunk into a single byte, collect the bytes into a byte array, and then decode that array into a string using UTF-8:

fun binaryToStr(binary: String) =
    binary.chunked(8)
        .map { it.toUByte(2).toByte() }
        .toByteArray()
        .decodeToString()
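The same idea should carry over to the other two encodings. The following is a sketch under the assumption that the input is well-formed; hexToStr and base64ToStr are made-up names, not part of any library.

```kotlin
import java.util.Base64

// Hypothetical counterpart of strToHex: two hex digits per byte.
fun hexToStr(hex: String): String =
    hex.chunked(2)
        .map { it.toUByte(16).toByte() }
        .toByteArray()
        .decodeToString()

// Hypothetical counterpart of strToBase64.
fun base64ToStr(encoded: String): String =
    Base64.getDecoder().decode(encoded).decodeToString()

fun main() {
    println(hexToStr("d8a3") == "\u0623")   // true
    println(base64ToStr("aGVsbG8="))        // hello
}
```

Invalid digits make toUByte throw a NumberFormatException, and the Base64 decoder throws an IllegalArgumentException on malformed input, so real code may want to handle those cases.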