I am trying to run LibreOffice in a Docker container to convert some .pages files to PDF. The application is a Web API and runs perfectly on a Windows virtual machine. I am new to Linux, Docker, and containers.
I have been able to deploy everything to a container and call the API, but I just get an empty document back, and I have no idea why. I'm also unsure of the best way to debug this issue, so any advice is greatly appreciated.
Here is how I am installing LibreOffice in the Dockerfile.
FROM mcr.microsoft.com/dotnet/aspnet:6.0 AS base
EXPOSE 80
RUN apt-get update
RUN apt-get install -y libreoffice
Here is the relevant part of my application responsible for doing the conversion.
string libreOfficeArgs = "--norestore --nofirststartwizard --headless --convert-to pdf \"{inputFile}\" --outdir \"{outputFolder}\"";
string libreOfficeExe = "/usr/bin/libreoffice";
//string libreOfficeExe = "/usr/bin/soffice"; Doesn't work either.
var conversionArgs = libreOfficeArgs.Replace("{inputFile}", inputPath).Replace("{outputFolder}", Path.GetDirectoryName(inputPath));
var conversionProcess = new Process
{
    StartInfo = new ProcessStartInfo
    {
        FileName = libreOfficeExe,
        Arguments = conversionArgs
    }
};
conversionProcess.Start();
await conversionProcess.WaitForExitAsync(); //TODO: Timeout?
conversionProcess.Close();
//I then read the output file into a stream and the API returns the stream
Any advice on how to investigate further or fix my problem would be greatly appreciated.
EDIT:
I can see the following in the logs, so the API is clearly calling LibreOffice. I think the problem could be related to how I am installing it.
convert /tmp/tmpuYq5ri.pages -> /tmp/tmpuYq5ri.pdf using filter : writer_Pdf_Export
EDIT 2:
Here is how the stream is being read.
var outputFilePath = Path.ChangeExtension(inputPath, "pdf");
var ms = new MemoryStream();
using (var fs = new FileStream(conversionOutput, FileMode.Open))
{
    await fs.CopyToAsync(ms);
    ms.Seek(0, SeekOrigin.Begin);
}
CodePudding user response:
It seems you are trying to convert a .pages file. According to this source and this bug, trying to convert a pages file in old versions of LibreOffice yields a blank document, which would explain your issue.
Try updating LibreOffice to a version where this bug is fixed by modifying your Dockerfile:
FROM mcr.microsoft.com/dotnet/aspnet:6.0 AS base
EXPOSE 80
RUN apt-get update && apt-get install -y libreoffice-java-common
ADD https://ftp.gwdg.de/pub/tdf/libreoffice/stable/7.4.3/deb/x86_64/LibreOffice_7.4.3_Linux_x86-64_deb.tar.gz .
RUN tar zxvf LibreOffice_7.4.3_Linux_x86-64_deb.tar.gz
RUN cd LibreOffice_7.4.3.2_Linux_x86-64_deb/DEBS && dpkg -i *.deb
Also note that the path to the LibreOffice executable is different with this install: /opt/libreoffice7.4/program/soffice
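As for how to debug this kind of problem, it may help to capture what soffice writes to stdout and stderr, log its exit code, and check that the PDF actually exists and is non-empty before streaming it back. Below is a minimal sketch of that idea, assuming .NET 6 and the install path above. The class name LibreOfficeConverter and the methods RunAsync and ConvertToPdfAsync are hypothetical names chosen for illustration, not part of your code or of any library.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper for illustration; adapt the names to your project.
public static class LibreOfficeConverter
{
    // Assumed install location of the 7.4 packages; adjust if yours differs.
    public const string SofficePath = "/opt/libreoffice7.4/program/soffice";

    // Runs an executable and returns its exit code plus captured output.
    public static async Task<(int ExitCode, string StdOut, string StdErr)> RunAsync(
        string exe, string args)
    {
        using var process = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = exe,
                Arguments = args,
                RedirectStandardOutput = true,
                RedirectStandardError = true,
                UseShellExecute = false
            }
        };
        process.Start();

        // Start reading both streams before waiting, so a full pipe
        // buffer cannot block the child process.
        var stdOutTask = process.StandardOutput.ReadToEndAsync();
        var stdErrTask = process.StandardError.ReadToEndAsync();
        await process.WaitForExitAsync();

        return (process.ExitCode, await stdOutTask, await stdErrTask);
    }

    // Converts inputPath to PDF next to the input file and returns the PDF path.
    public static async Task<string> ConvertToPdfAsync(string inputPath)
    {
        var outputFolder = Path.GetDirectoryName(inputPath) ?? ".";
        var args = "--norestore --nofirststartwizard --headless --convert-to pdf "
                   + $"\"{inputPath}\" --outdir \"{outputFolder}\"";

        var (exitCode, stdOut, stdErr) = await RunAsync(SofficePath, args);

        // Log everything soffice reported; replace with your own logger.
        Console.WriteLine($"soffice exit code: {exitCode}");
        Console.WriteLine($"soffice stdout: {stdOut}");
        Console.WriteLine($"soffice stderr: {stdErr}");

        // Fail loudly instead of returning an empty stream to the caller.
        var outputPath = Path.ChangeExtension(inputPath, "pdf");
        var info = new FileInfo(outputPath);
        if (!info.Exists || info.Length == 0)
        {
            throw new InvalidOperationException(
                $"Conversion produced no usable output at {outputPath}");
        }
        return outputPath;
    }
}
```

You would call it with something like `var pdfPath = await LibreOfficeConverter.ConvertToPdfAsync(inputPath);` and then open pdfPath for reading. Note that this only detects a missing or zero-byte file; a PDF that LibreOffice wrote with blank pages would still pass the check, so it is worth opening one converted file by hand as well.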