How to get file length in Go dynamically?

I have the following code snippet:

func main() {
    // Some text we want to compress.
    original := "bird and frog"

    // Open a file for writing.
    f, _ := os.Create("C:\\programs\\file.gz")

    // Create gzip writer.
    w := gzip.NewWriter(f)

    // Write bytes in compressed form to the file.
    for /* looping over database cursor */ {
        w.Write([]byte(/* the row from the database as obtained from cursor */))
    }

    // Close the gzip writer, then the underlying file.
    w.Close()
    f.Close()

    fmt.Println("DONE")
}

However, I would like to make a small modification: when the size of the file reaches a certain threshold, I want to close it and open a new file, also in compressed format.

For example:

Assume a database has 10 rows, and each row is 50 bytes.

Assume the compression factor is 2, i.e. 1 row of 50 bytes is compressed to 25 bytes.

Assume the file size limit is 50 bytes.

This means that after every 2 records I should close the file and open a new one.

How do I keep track of the file size while it's still open and I'm still writing compressed data to it?

CodePudding user response:

You can use the os.File.Seek method to get your current position in the file, which, as you're writing the file, will be the current file size in bytes.

For example:

package main

import (
    "compress/gzip"
    "fmt"
    "io"
    "os"
)

func main() {
    // Some text we want to compress.
    lines := []string{
        "this is a test",
        "the quick brown fox",
        "jumped over the lazy dog",
        "the end",
    }

    // Open a file for writing.
    f, err := os.Create("file.gz")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    // Create gzip writer.
    w := gzip.NewWriter(f)

    // Write bytes in compressed form to the file.
    for _, line := range lines {
        w.Write([]byte(line))

        // Flush so the compressed bytes reach the file before measuring.
        w.Flush()
        pos, err := f.Seek(0, io.SeekCurrent)
        if err != nil {
            panic(err)
        }

        fmt.Printf("pos: %d\n", pos)
    }

    // Close the file.
    w.Close()

    // The call to w.Close() will write out any remaining data
    // and the final checksum.
    pos, err := f.Seek(0, io.SeekCurrent)
    if err != nil {
        panic(err)
    }
    fmt.Printf("pos: %d\n", pos)

    fmt.Println("DONE")
}

Which outputs:

pos: 30
pos: 55
pos: 83
pos: 94
pos: 107
DONE

And we can confirm with wc:

$ wc -c file.gz
107 file.gz
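
Putting the two pieces together, here is a minimal sketch of the rotation the question asks for, built on the same Seek trick. The writeWithRotation name, the file_%d.gz naming scheme, and the 50-byte limit in main are illustrative choices, not anything from the question:

package main

import (
    "compress/gzip"
    "fmt"
    "io"
    "os"
)

func writeWithRotation(records []string, maxSize int64) error {
    var (
        f    *os.File
        w    *gzip.Writer
        part int
    )

    // openNext starts a new numbered output file and a fresh gzip stream.
    openNext := func() error {
        part++
        var err error
        f, err = os.Create(fmt.Sprintf("file_%d.gz", part))
        if err != nil {
            return err
        }
        w = gzip.NewWriter(f)
        return nil
    }

    // closeCurrent finishes the gzip stream (trailer and checksum) and
    // closes the underlying file.
    closeCurrent := func() error {
        if err := w.Close(); err != nil {
            return err
        }
        return f.Close()
    }

    if err := openNext(); err != nil {
        return err
    }

    for _, rec := range records {
        if _, err := w.Write([]byte(rec)); err != nil {
            return err
        }
        // Flush so the compressed bytes reach the file, then ask the
        // file for our current position, i.e. its size so far.
        if err := w.Flush(); err != nil {
            return err
        }
        pos, err := f.Seek(0, io.SeekCurrent)
        if err != nil {
            return err
        }
        if pos >= maxSize {
            if err := closeCurrent(); err != nil {
                return err
            }
            if err := openNext(); err != nil {
                return err
            }
        }
    }

    return closeCurrent()
}

func main() {
    records := []string{"bird and frog", "the quick brown fox", "jumped over the lazy dog"}
    if err := writeWithRotation(records, 50); err != nil {
        panic(err)
    }
    fmt.Println("DONE")
}

Two trade-offs to be aware of: flushing after every record costs some compression ratio, and a file can end up slightly over the limit because Close appends the gzip trailer after the size check.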

CodePudding user response:

gzip.NewWriter takes an io.Writer. It is easy to implement a custom io.Writer that does what you want.

For example (Playground):

type MultiFileWriter struct {
    maxLimit      int
    currentSize   int
    currentWriter io.Writer
}

func (m *MultiFileWriter) Write(data []byte) (n int, err error) {
    if len(data)+m.currentSize > m.maxLimit {
        m.currentWriter = createNextFile()
        m.currentSize = 0
    }
    m.currentSize += len(data)
    return m.currentWriter.Write(data)
}

Note: You will need to handle a few edge cases, such as what happens if len(data) is greater than maxLimit. And maybe you don't want to split a record across files.
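
To make that concrete, here is one way the sketch could be fleshed out; the nextFile helper, the part field, and the part_%d.gz naming are assumptions for the demo, not part of the original answer. Because this writer sits underneath gzip, maxLimit applies to the compressed bytes:

package main

import (
    "compress/gzip"
    "fmt"
    "os"
)

type MultiFileWriter struct {
    maxLimit    int
    currentSize int
    part        int
    current     *os.File
}

// nextFile closes the current file (if any) and opens the next one.
func (m *MultiFileWriter) nextFile() error {
    if m.current != nil {
        if err := m.current.Close(); err != nil {
            return err
        }
    }
    m.part++
    f, err := os.Create(fmt.Sprintf("part_%d.gz", m.part))
    if err != nil {
        return err
    }
    m.current = f
    m.currentSize = 0
    return nil
}

// Write satisfies io.Writer, rotating to a new file when the next
// chunk would push the current file past maxLimit.
func (m *MultiFileWriter) Write(data []byte) (int, error) {
    if m.current == nil || m.currentSize+len(data) > m.maxLimit {
        if err := m.nextFile(); err != nil {
            return 0, err
        }
    }
    m.currentSize += len(data)
    return m.current.Write(data)
}

func main() {
    mw := &MultiFileWriter{maxLimit: 50}
    w := gzip.NewWriter(mw) // gzip sends its compressed output through mw

    lines := []string{"bird and frog", "the quick brown fox", "jumped over the lazy dog"}
    for _, line := range lines {
        if _, err := w.Write([]byte(line)); err != nil {
            panic(err)
        }
    }
    if err := w.Close(); err != nil {
        panic(err)
    }
    if mw.current != nil {
        mw.current.Close()
    }
    fmt.Println("DONE")
}

One caveat with this split-below-gzip design: each part is a slice of one continuous gzip stream, so the parts are only decompressible when concatenated back together. If each file must stand alone, close the gzip.Writer and start a new one at each rotation instead (as the first answer's approach does).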
