The goal is to return three pandas dataframes (result1
, result2
and result3
) as three sheets in an excel file as response to a fastapi route. This is how the end of the function looks like
from fastapi.responses import StreamingResponse
...
with BytesIO() as buffer:
# pylint: disable=abstract-class-instantiated
with pandas.ExcelWriter(path=buffer, engine=XCL_ENGN) as writer:
result1.to_excel(excel_writer=writer, sheet_name='Report 1', index=False)
result2.to_excel(excel_writer=writer, sheet_name='Report 2', index=False)
result3.to_excel(excel_writer=writer, sheet_name='Report 3', index=False)
return StreamingResponse(content=BytesIO(initial_bytes=buffer.getvalue()),
headers=headers)
The code is working as expected. However, the return statement seems to open a bytestream that forms the content being returned, but never operates close()
on it. I tried wrapping that inside another with statement, but seems that throws a
ValueError: I/O operation on closed file.
I also tried the contextlib.closing method, but same error. So is there any resource being leaked here and how to avoid it?
Also, it seems when I am writing the data to the first buffer, then forming another bytestream just to return the value (which essentially copied everything, perhaps) kinda inefficient. Any way to make this part of the code cleaner and leakage free, adhering to best context management practices?
CodePudding user response:
As discussed in the comments, so long as it's okay to buffer the Excel worksheet fully into memory, StreamingResponse
is not necessary here at all.
from fastapi.responses import Response
# ...
with BytesIO() as buffer:
with pandas.ExcelWriter(path=buffer, engine=XCL_ENGN) as writer:
result1.to_excel(...)
data = buffer.getvalue()
return Response(content=data, headers=headers)
will do just fine.
If the worksheet is too large to keep in memory, then you could use e.g. a tempfile.TemporaryFile
and a background task to dispose of it once the response is written.