How to avoid file content repetition with zipfile

Time:05-10

I need to compress multiple XML files, and I achieved this with lxml, zipfile and a for loop.

My problem is that every time I rerun my function, the content of the compressed files repeats (it is appended at the end) and the files keep getting longer. I believe it has to do with the a+b writing mode. I thought that by using with open, the files would be deleted at the end of the code block and no more content would be added to them. I was wrong, and with the other modes I do not get the intended result.

Here is my code:

def compress_package_file(self):
   bytes_buffer = BytesIO()
   with zipfile.ZipFile(bytes_buffer, 'w') as invoices_package:
       i = 1
       for invoice in record.invoice_ids.sorted('sin_number'):
           invoice_file_name = 'Invoice_' + invoice.number + '.xml'
           with open(invoice_file_name, 'a+b') as invoice_file:
               invoice_file.write(invoice._get_invoice_xml().getvalue())
               invoices_package.write(invoice_file_name, compress_type=zipfile.ZIP_DEFLATED)
           i += 1
   compressed_package = bytes_buffer.getvalue()
   encoded_compressed_file = base64.b64encode(compressed_package)

My XML generator is in another function and works fine, but the content repeats each time I run this function. For example, if I run it twice, the content of each file in the compressed file looks something like this (simplified content):

<?xml version='1.0' encoding='UTF-8'?>
<invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="invoice.xsd">
    <header>
        <invoiceNumber>9</invoiceNumber>
    </header>
</invoice><?xml version='1.0' encoding='UTF-8'?>
<invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="invoice.xsd">
    <header>
        <invoiceNumber>9</invoiceNumber>
    </header>
</invoice>

If I use w+b mode, the files in the archive are blank. What should my code look like to avoid this behavior?

Answer:

I suggest you do use w+b mode, but move the write to the zipfile so that it happens after the invoice XML file has been closed.

From what you wrote, it looks as if you are trying to compress a file that has not yet been flushed to disk. zipfile reads the file from disk by name, so with w+b (which truncates the file on open) it is still empty at the time of compression.
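
A minimal sketch of that buffering effect, assuming a temporary directory and a made-up file name (neither is from the original code). For a small payload the bytes normally sit in Python's write buffer, so the size on disk stays at zero until the with block closes the file:

```python
import os
import tempfile

def sizes_before_and_after_close(payload: bytes):
    """Return the on-disk size while the file is still open and after it is closed."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'Invoice_demo.xml')  # hypothetical file name
        with open(path, 'w+b') as f:
            f.write(payload)
            # The payload is still buffered in memory at this point.
            size_while_open = os.path.getsize(path)
        # Leaving the with block closes the file, which flushes the buffer.
        size_after_close = os.path.getsize(path)
    return size_while_open, size_after_close

before, after = sizes_before_and_after_close(b'<invoice/>')
print(before, after)
```

Anything that reads the file by name while it is still open, as invoices_package.write does, sees the smaller size.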

So, try removing one level of indentation from the invoices_package.write line (I can't format code properly on mobile, so I can't post the whole section).
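
The suggestion above could be sketched roughly as follows. This is not the asker's actual method: the invoices argument (a list of number and XML byte pairs) and the work_dir argument are hypothetical stand-ins for record.invoice_ids and the current directory, so the sketch can run on its own.

```python
import base64
import os
import tempfile
import zipfile
from io import BytesIO

def compress_invoices(invoices, work_dir):
    """Zip each (number, xml_bytes) pair and return the archive base64-encoded.

    `invoices` and `work_dir` are hypothetical stand-ins for the Odoo records
    and the working directory used in the question.
    """
    bytes_buffer = BytesIO()
    with zipfile.ZipFile(bytes_buffer, 'w') as invoices_package:
        for number, xml_bytes in invoices:
            invoice_file_name = 'Invoice_' + str(number) + '.xml'
            path = os.path.join(work_dir, invoice_file_name)
            # 'wb' truncates the file, so reruns do not accumulate content.
            with open(path, 'wb') as invoice_file:
                invoice_file.write(xml_bytes)
            # The file is closed (and flushed) here, so zipfile reads all of it.
            invoices_package.write(path, arcname=invoice_file_name,
                                   compress_type=zipfile.ZIP_DEFLATED)
    return base64.b64encode(bytes_buffer.getvalue())

def run_twice_and_read_back():
    """Run the compression twice and return the archive's names and content."""
    invoices = [(9, b"<invoice><invoiceNumber>9</invoiceNumber></invoice>")]
    with tempfile.TemporaryDirectory() as work_dir:
        compress_invoices(invoices, work_dir)
        encoded = compress_invoices(invoices, work_dir)
    with zipfile.ZipFile(BytesIO(base64.b64decode(encoded))) as zf:
        return zf.namelist(), zf.read('Invoice_9.xml')
```

If the XML files are not needed on disk at all, ZipFile.writestr can add the bytes to the archive directly, which sidesteps the open mode question entirely.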
