I need to compress multiple XML files, and I achieved this with lxml, zipfile and a for loop.
My problem is that every time I rerun my function, the content of the compressed files is repeated (appended at the end) and keeps getting longer. I believe it has to do with the a+b write mode. I thought that by using with open, the files would be deleted at the end of the code block and no more content would be added to them. I was wrong, and with the other modes I do not get the intended result.
Here is my code:
    def compress_package_file(self):
        bytes_buffer = BytesIO()
        with zipfile.ZipFile(bytes_buffer, 'w') as invoices_package:
            i = 1
            for invoice in record.invoice_ids.sorted('sin_number'):
                invoice_file_name = 'Invoice_' + invoice.number + '.xml'
                with open(invoice_file_name, 'a+b') as invoice_file:
                    invoice_file.write(invoice._get_invoice_xml().getvalue())
                    invoices_package.write(invoice_file_name, compress_type=zipfile.ZIP_DEFLATED)
                i += 1
        compressed_package = bytes_buffer.getvalue()
        encoded_compressed_file = base64.b64encode(compressed_package)
My XML generator is in another function and works fine, but the content repeats each time I run this function. For example, if I run it twice, the content of each file in the compressed package looks something like this (simplified content):
    <?xml version='1.0' encoding='UTF-8'?>
    <invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="invoice.xsd">
      <header>
        <invoiceNumber>9</invoiceNumber>
      </header>
    </invoice><?xml version='1.0' encoding='UTF-8'?>
    <invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="invoice.xsd">
      <header>
        <invoiceNumber>9</invoiceNumber>
      </header>
    </invoice>
If I use w+b mode, the files in the archive are blank. What should my code look like to avoid this behavior?
Answer:
I suggest you do use w+b mode, but add the file to the zip archive only after the invoice XML file has been closed.
From what you wrote, it looks as if you are trying to compress a file that has not yet been flushed to disk. With w+b the file is truncated on open, so it is still empty at the time of compression.
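To illustrate the buffering effect described above, here is a minimal sketch. The file name demo.xml and the temporary directory are made up for the demonstration, and the exact moment the buffer is flushed depends on the Python implementation and buffer size, so treat the sizes as typical rather than guaranteed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as work_dir:
    path = os.path.join(work_dir, 'demo.xml')  # hypothetical file name
    with open(path, 'w+b') as demo_file:
        demo_file.write(b'<invoice/>')
        # A small write normally stays in Python's in-memory buffer here,
        # so the file on disk is usually still empty.
        size_while_open = os.path.getsize(path)
    # Leaving the with block closes the file, which flushes the buffer.
    size_after_close = os.path.getsize(path)

print(size_while_open, size_after_close)
```

Anything that reads the file by name while it is still open, as zipfile does, sees the smaller size.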
So, try removing one level of indentation from the invoices_package.write line, so that it runs after the with open block has ended. (I can't format code properly on mobile, so I can't post the whole section.)
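For reference, the change could look roughly like the sketch below. It is not the original Odoo method: compress_invoices, its invoices dict (invoice number to XML bytes) and work_dir are hypothetical stand-ins for record.invoice_ids and invoice._get_invoice_xml().getvalue(), chosen so the example runs on its own.

```python
import base64
import os
import tempfile
import zipfile
from io import BytesIO


def compress_invoices(invoices, work_dir):
    """Zip XML payloads and return the archive as base64 bytes.

    `invoices` is a hypothetical {invoice number: XML bytes} mapping.
    """
    bytes_buffer = BytesIO()
    with zipfile.ZipFile(bytes_buffer, 'w') as invoices_package:
        for number, xml_bytes in sorted(invoices.items()):
            invoice_file_name = 'Invoice_' + number + '.xml'
            invoice_path = os.path.join(work_dir, invoice_file_name)
            # 'w+b' truncates the file, so a rerun does not append to old content.
            with open(invoice_path, 'w+b') as invoice_file:
                invoice_file.write(xml_bytes)
            # The file is closed (and flushed) by now, so the archive
            # receives the complete content.
            invoices_package.write(invoice_path, arcname=invoice_file_name,
                                   compress_type=zipfile.ZIP_DEFLATED)
    # The ZipFile must be closed before reading the buffer.
    return base64.b64encode(bytes_buffer.getvalue())


# Usage: running it twice on the same directory yields the same single copy.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = {'9': b"<invoice><header/></invoice>"}
    compress_invoices(sample, tmp_dir)
    encoded_compressed_file = compress_invoices(sample, tmp_dir)
```

If the intermediate files are not needed on disk, ZipFile.writestr(name, data) can add the bytes to the archive directly, which avoids the open/close ordering issue altogether.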