Home > Net >  Translate Java file compression to Python3 with GZIP
Translate Java file compression to Python3 with GZIP

Time:04-01

I need to compress a file into a specific format that is required by our country's tax regulation entity and it has to be sent encoded in base64.

I work on Python3 and attempted to do the compression with the following code:

import gzip

# Work file generated before and stored in BytesBuffer
my_file = bytes_buffer.getvalue()

def compress(work_file):
   encoded_work_file = base64.b64encode(work_file)
   compressed_work_file = gzip.compress(encoded_work_file )
   return base64.b64encode(compressed_work_file )
   
compress(my_file)

Now the tax entity returns an error message about an unknown compression format. Luckily, they provided us the following Java example code:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class DemoGZIP {

    private final byte[] BUFFER = new byte[1024];

    /**
     * @param work_file File to compress
     *      The file is compressed over the original file name with the extension .zip
     * @return boolean 
     *      TRUE success
     *      FALSE failure
     */
    public boolean compress(File work_file ) {                
        try (GZIPOutputStream  out = new GZIPOutputStream (new FileOutputStream(work_file .getAbsolutePath()   ".zip"));
                FileInputStream in = new FileInputStream(work_file )) {
            int len;
            while ((len = in.read(BUFFER)) != -1) {
                out.write(BUFFER, 0, len);
            }
            out.close();
        } catch (IOException ex) {
            System.err.println(ex.getMessage());
            return false;
        }
        return true;
    }

The problem is that I do not have any experience working on Java and do not understand much of the provided code.

Can someone please help me adapt my code to do what the provided code does in python?

CodePudding user response:

As noted in the comment, the Java code does not do Base64 coding, and names the resulting file incorrectly. It is most definitely not a zip file, it is a gzip file. The suffix should be ".gz". Though I doubt that the name matters to your tax agency.

More importantly, you are encoding with Base64 twice. From your description, you should only do that once, after gzip compression. From the Java code, you shouldn't do Base64 encoding at all! You need to get clarification on that.

  • Related