I am testing some AI Document analysis stuff, and am currently trying to allow users to Upload Files to a WebApp, which in turn sends them to Azure Form Recognizer and processes the results.
I am however not able to do so in a single Request.
This is how the Files are represented:
[BindProperty] public List<IFormFile> Upload { get; set; }
I can iterate over these and get the expected results, but this makes the operation take quite long. I would like to just send all of the files in one request (as shown below), but it only ever analyzes the first one. I am using Azure.AI.FormRecognizer.DocumentAnalysis, so the client and StartAnalyzeDocument Method is from there.
using (var stream = new MemoryStream())
{
foreach (IFormFile formFile in Upload)
{
formFile.CopyTo(stream);
}
stream.Seek(0, SeekOrigin.Begin);
AnalyzeDocumentOperation operation = client.StartAnalyzeDocument(modelId, stream);
operation.WaitForCompletion();
Console.WriteLine("This many documents were analysed: " operation.Value.Documents.Count);
result = operation.Value;
};
"result" is what I process later on. I am quite stumped on this, as I would have expected the appended stream to just work. If anyone has a solution or could point me in the right direction, it would be much appreciated.
CodePudding user response:
Form Recognizer does not yet support processing multiple documents in a single analyze operation for prebuilt-invoice and custom models. Furthermore, most file formats cannot just be appended together to concatenate the content.
One way to speed up the analysis of multiple files in a batch is to call the analyze operation in parallel. Here is a sketch.
var results = Upload.AsParallel().ForAll(formFile =>
{
using (var stream = formFile.OpenReadStream())
{
var operation = client.StartAnalyzeDocument(modelId, stream);
operation.WaitForCompletion();
return operation.Value;
}
}).ToArray();