I'm working on a Blazor WASM app, and I want my users to be able to easily open PDF files at specific pages that contain additional information. I cannot distribute those files myself or upload them to any kind of server; each user has to provide them themselves.
Because the files are up to 60 MB in size, I cannot convert the whole uploaded file to base64 and display it as described here.
However, I don't have to display the whole file and could load just the needed page plus a few pages around it.
For that I tried using iText7's ExtractPageRange(). This answer indicates that I have to override the GetNextPdfWriter() method and store all streams in a collection.
using System.Collections.Generic;
using System.IO;
using iText.Kernel.Pdf;
using iText.Kernel.Utils;

class ByteArrayPdfSplitter : PdfSplitter {
    public ByteArrayPdfSplitter(PdfDocument pdfDocument) : base(pdfDocument) {
    }

    // Called once per split document; remember every stream handed to a writer.
    protected override PdfWriter GetNextPdfWriter(PageRange documentPageRange) {
        CurrentMemoryStream = new MemoryStream();
        UsedStreams.Add(CurrentMemoryStream);
        return new PdfWriter(CurrentMemoryStream);
    }

    public MemoryStream CurrentMemoryStream { get; private set; }
    public List<MemoryStream> UsedStreams { get; set; } = new List<MemoryStream>();
}
Then I thought I could merge those streams and convert them to base64:
var file = loadedFiles.First();
using (MemoryStream ms = new MemoryStream())
{
    var rs = file.OpenReadStream(maxFileSize);
    await rs.CopyToAsync(ms);
    ms.Position = 0;
    // rs has to be copied into ms because the PdfReader constructor uses a
    // synchronous read, which rs does not support (it throws an exception).
    PdfReader pdfReader = new PdfReader(ms);
    var document = new PdfDocument(pdfReader);
    var splitter = new ByteArrayPdfSplitter(document);
    var range = new PageRange();
    range.AddPageSequence(1, 10);
    var splitDoc = splitter.ExtractPageRange(range);
    // Edit: commented this out; it shouldn't have been here at all and leads to an exception.
    //splitDoc.Close();
    var outputMs = new MemoryStream();
    foreach (var usedMs in splitter.UsedStreams)
    {
        usedMs.Position = 0;
        outputMs.Position = outputMs.Length;
        await usedMs.CopyToAsync(outputMs);
    }
    var data = outputMs.ToArray();
    currentPdfContent = "data:application/pdf;base64,";
    currentPdfContent += Convert.ToBase64String(data);
    pdfLoaded = true;
}
This, however, doesn't work. Does anyone have a suggestion for how to get this working, or maybe a simpler solution I could try?
Edit:
I took a closer look in the debugger, and it seems that the resulting stream outputMs is always empty. So the problem is probably in how I split the PDF.
Answer:
After at least partially clearing up my misconception about what it means to be unable to access the file system from Blazor WASM, I managed to find a working solution.
await using MemoryStream ms = new MemoryStream();
var rs = file.OpenReadStream(maxFileSize);
// File I/O here goes to the WASM runtime's in-memory file system, not the user's disk.
await using var fs = new FileStream("test.pdf", FileMode.Create);
await rs.CopyToAsync(fs);
fs.Close();
string path = "test.pdf";
string range = "10 - 15";
var pdfDocument = new PdfDocument(new PdfReader(path));
var split = new MySplitter(pdfDocument);
var result = split.ExtractPageRange(new PageRange(range));
// Closing the extracted document writes it out to split.pdf.
result.Close();
pdfDocument.Close();
await using var splitFs = new FileStream("split.pdf", FileMode.Open);
await splitFs.CopyToAsync(ms);
var data = ms.ToArray();
var pdfContent = "data:application/pdf;base64,";
pdfContent += System.Convert.ToBase64String(data);
Console.WriteLine(pdfContent);
currentPdfContent = pdfContent;
This uses the MySplitter class from this answer:
class MySplitter : PdfSplitter
{
    public MySplitter(PdfDocument pdfDocument) : base(pdfDocument)
    {
    }

    protected override PdfWriter GetNextPdfWriter(PageRange documentPageRange)
    {
        String toFile = "split.pdf";
        return new PdfWriter(toFile);
    }
}
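A possible variation that skips the temporary split.pdf, offered as an untested sketch rather than a confirmed fix: as far as I can tell, PdfWriter closes its underlying stream by default when the document is closed, which would explain the exception from the commented-out Close() call in the question (setting Position on a closed MemoryStream throws). MemoryStream.ToArray() still works after the stream has been closed, so the bytes can be read that way. The helper name PdfDataUri.FromStream below is hypothetical and not part of iText or Blazor; it uses only the base class library.

```csharp
using System;
using System.IO;

public static class PdfDataUri
{
    // Hypothetical helper: builds a PDF data URI from a MemoryStream,
    // even if the stream has already been closed by whoever wrote to it.
    public static string FromStream(MemoryStream pdfStream)
    {
        // ToArray() is documented to work on a closed MemoryStream, whereas
        // setting Position or calling CopyTo would throw ObjectDisposedException.
        byte[] data = pdfStream.ToArray();

        // Concatenate prefix and payload; a plain assignment would drop the prefix.
        return "data:application/pdf;base64," + Convert.ToBase64String(data);
    }
}
```

With the ByteArrayPdfSplitter from the question, the idea would be to call splitDoc.Close() first and then set currentPdfContent = PdfDataUri.FromStream(splitter.CurrentMemoryStream), assuming a single extracted range so that only one stream is involved.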