To reduce the scanner's default 64k buffer (for a microcomputer with little memory), I am trying to use a custom buffer and a custom split function:
scanner.Buffer(make([]byte, 5120), 64)
scanner.Split(Scan64Bytes)
Here I noticed that the second argument of Buffer, "max", has no effect. If I pass e.g. 0, 1, 5120 or bufio.MaxScanTokenSize instead, I can't see any difference. Only the first argument, "buf", has consequences: if its capacity is too small, the scan is incomplete, and if it is too large, the B/op value reported by benchmem increases.
From the doc:
The maximum token size is the larger of max and cap(buf). If max <= cap(buf), Scan will use this buffer only and do no allocation.
I don't understand which is the correct max value. Can you maybe explain this to me, please?
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// Scan64Bytes splits the input into 64-byte tokens.
func Scan64Bytes(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if len(data) < 64 {
		// fewer than 64 bytes left: return them as the final token
		return 0, data[0:], bufio.ErrFinalToken
	}
	return 64, data[0:64], nil
}

func main() {
	// improvised source of the same size:
	cmdstd := bytes.NewReader(make([]byte, 5120))
	scanner := bufio.NewScanner(cmdstd)
	// I guess 64 is the correct max arg:
	scanner.Buffer(make([]byte, 5120), 64)
	scanner.Split(Scan64Bytes)
	for i := 0; scanner.Scan(); i++ {
		fmt.Printf("%v: %v\r\n", i, scanner.Bytes())
	}
	if err := scanner.Err(); err != nil {
		fmt.Println(err)
	}
}
CodePudding user response:
max value has no effect on custom Split?
No, the result is the same without a custom split function. However, the following combination would not work without the split function and ErrFinalToken:
// your reader/input
cmdstd := bytes.NewReader(make([]byte, 5120))
// your scanner buffer size
scanner.Buffer(make([]byte, 5120), 64)
The scanner's buffer should be larger than the input. This is how I would set buf and max:
scanner.Buffer(make([]byte, 5121), 5120)