Why does Base64 buffer sizing make it larger than the length of the underlying text?

Time:10-02

I am trying to encode a byte slice as Base64 and have two questions about sizing the destination buffer. I can size it with base64.StdEncoding.EncodedLen(len(text)), but I'm worried that's costly, so I wanted to see whether I could do it with just len(text). Here is the code (the functions are named "Marshal" because I'm using them as field converters during JSON marshaling):

package main

import (
    "crypto/rand"
    "encoding/base64"
    "fmt"
)

func main() {
    b := make([]byte, 60)
    _, _ = rand.Read(b)

    // Marshal Create Dst Buffer
    MarshalTextBuffer(b)

    // Marshal Convert to String
    MarshalTextStringWithBufferLen(b)

    // Marshal Convert to String
    MarshalTextStringWithDecodedLen(b)
}

func MarshalTextBuffer(text []byte) error {
    ba := base64.StdEncoding.EncodeToString(text)
    fmt.Println(ba)
    return nil
}

func MarshalTextStringWithBufferLen(text []byte) error {
    ba := make([]byte, len(text)+30) // Why does len(text) not suffice? Temporarily adding 30 for now, just so it doesn't overrun.
    base64.StdEncoding.Encode(ba, text)
    fmt.Println(ba)
    return nil
}

func MarshalTextStringWithDecodedLen(text []byte) error {
    ba := make([]byte, base64.StdEncoding.EncodedLen(len(text)))
    base64.StdEncoding.Encode(ba, text)
    fmt.Println(ba)
    return nil
}

Here's the output:

IL5CW8T9WSgwU5Hyi9JsLLkU/EcydY6pG2fgLQJsMaXgxhSh74RTagzr6b9yDeZ8CP4Azc8xqq5/+Cgk
[73 76 53 67 87 56 84 57 87 83 103 119 85 53 72 121 105 57 74 115 76 76 107 85 47 69 99 121 100 89 54 112 71 50 102 103 76 81 74 115 77 97 88 103 120 104 83 104 55 52 82 84 97 103 122 114 54 98 57 121 68 101 90 56 67 80 52 65 122 99 56 120 113 113 53 47 43 67 103 107 0 0 0 0 0 0 0 0 0 0]
[73 76 53 67 87 56 84 57 87 83 103 119 85 53 72 121 105 57 74 115 76 76 107 85 47 69 99 121 100 89 54 112 71 50 102 103 76 81 74 115 77 97 88 103 120 104 83 104 55 52 82 84 97 103 122 114 54 98 57 121 68 101 90 56 67 80 52 65 122 99 56 120 113 113 53 47 43 67 103 107]

Why does the middle one, MarshalTextStringWithBufferLen, require extra space beyond len(text)?

Is base64.StdEncoding.EncodedLen a costly function? I can solve the problem with the bottom function, but I worry about the cost.

Answer:

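Base64 encoding stores binary data (8 bits per byte) as text in which each character carries only 6 bits, so every 3 input bytes are encoded as 4 output characters (3x8 = 4x6 = 24 bits). That is why len(text) is too small for the destination buffer. The len(text)+30 in your code only happens to work for this input: 60 bytes encode to 80 characters, and the remaining 10 bytes of the 90-byte buffer stay zero, which is the run of zeros at the end of your second output. The exact size is len(text)*4/3 when len(text) is divisible by 3; otherwise the last partial group is padded with = to a full 4 characters. For readability and to avoid bugs, you should use base64.StdEncoding.EncodedLen(len(text)) to get the length.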
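To make the 3-to-4 ratio concrete, here is a small sketch that prints the encoded size for a few input lengths. The helper encodedSizes is a hypothetical name made up for this illustration; it only wraps the standard library's EncodedLen for the padded (StdEncoding) and unpadded (RawStdEncoding) variants.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodedSizes is a hypothetical helper for this illustration. It returns
// the Base64 output length for n input bytes, with and without '=' padding.
func encodedSizes(n int) (padded, unpadded int) {
	return base64.StdEncoding.EncodedLen(n), base64.RawStdEncoding.EncodedLen(n)
}

func main() {
	// 1 and 2 bytes form a partial group; 3 and 60 are whole groups.
	for _, n := range []int{1, 2, 3, 60} {
		p, u := encodedSizes(n)
		fmt.Printf("%d input bytes -> %d padded, %d unpadded\n", n, p, u)
	}
}
```

For the 60-byte input in the question both variants give 80, which matches the 80 non-zero values in the printed slices.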

If you look at the code for base64.StdEncoding.EncodedLen you will see that it is only a few integer operations, so it is as fast as doing the calculation yourself (especially as it is likely to be inlined).
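As a rough check of that claim, the sketch below compares a hand-written formula with EncodedLen. The function manualEncodedLen is a hypothetical name, and its formula is my own restatement of "round up to whole 3-byte groups, 4 characters each" for the padded encoding, not a copy of the library source.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// manualEncodedLen is a hypothetical hand-written equivalent for the padded
// encoding: round n up to whole 3-byte groups, then emit 4 characters each.
func manualEncodedLen(n int) int {
	return (n + 2) / 3 * 4
}

func main() {
	// Compare against the standard library over a range of input lengths.
	for n := 0; n <= 1000; n++ {
		if manualEncodedLen(n) != base64.StdEncoding.EncodedLen(n) {
			fmt.Println("mismatch at", n)
			return
		}
	}
	fmt.Println("manual formula agrees with EncodedLen for 0..1000")
}
```

Since the arithmetic is this small either way, writing it by hand saves nothing, and EncodedLen also handles the unpadded encodings correctly.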
