I'm building a solution that extracts data from invoice PDFs with the Microsoft Form Recognizer API and adds the information to a SQL database.
I built a parser and code that successfully add rows from the API response to the database when a PDF is uploaded to my blob storage.
I'm looking for the easiest way to handle multiple PDFs arriving at the same time.
I'm getting deadlocks when I test with several incoming PDFs because the processes conflict in SQL Server: if I upload 4 PDFs, all 4 are analysed, parsed and written to SQL at the same time, which causes conflicts and can leave the database rows in an illogical order (I don't want to run an update grouped by invoice number after each run to rearrange the whole table).
So I'm looking for a solution that takes the elements arriving in blob storage one after another instead of all at once. Something like a for loop that iterates sequentially over every blob "source" and passes each one as input to my parsing function.
Or something like a queue that would work like this:
PDF1, PDF2 and PDF3 arrive in blob storage:
Make PDF2 and PDF3 wait, send PDF1 to the API for analysis and add its data to SQL; once the last row is added, send PDF2 to the API and add its data to SQL; once the last row is added, send PDF3, and so on.
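The queue behaviour described above can be sketched with the Python standard library alone. This is only an illustration of the single-consumer idea, not an Azure implementation: `analyze_invoice` and `insert_rows` are hypothetical stand-ins for the Form Recognizer call plus parser and for the SQL inserts, and the "table" is just a list.

```python
import queue
import threading

def analyze_invoice(blob_name):
    """Hypothetical stand-in for the Form Recognizer call and the parser."""
    return [(blob_name, "row 1"), (blob_name, "row 2")]

def insert_rows(rows, table):
    """Hypothetical stand-in for the SQL inserts; here it appends to a list."""
    table.extend(rows)

def worker(pending, table):
    # Single consumer: one PDF is fully analysed and written
    # before the next one is taken from the queue.
    while True:
        blob_name = pending.get()
        try:
            if blob_name is None:  # sentinel: no more work
                break
            insert_rows(analyze_invoice(blob_name), table)
        finally:
            pending.task_done()

def process_sequentially(blob_names):
    pending = queue.Queue()
    table = []
    consumer = threading.Thread(target=worker, args=(pending, table))
    consumer.start()
    for name in blob_names:  # uploads may arrive in a burst
        pending.put(name)
    pending.put(None)
    consumer.join()
    return table

if __name__ == "__main__":
    for row in process_sequentially(["PDF1", "PDF2", "PDF3"]):
        print(row)
```

Because there is only one consumer, the rows of each invoice stay together and two writers never touch the table at the same time. The same pattern applies to a hosted queue, as long as only one message is processed at a time.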
Thanks for your suggestions :)
CodePudding user response:
You can route Azure Blob Storage events to an Azure Function: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-event-overview
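As a rough sketch of what handling such an event could look like, the snippet below pulls the PDF URLs out of an Event Grid payload so they can be handed to a sequential consumer. It assumes the Event Grid event schema, where a blob upload arrives with `eventType` set to `Microsoft.Storage.BlobCreated` and the blob address in `data.url`. The function name `pdf_urls_from_events` and the sample URL are made up for illustration, and the Azure Function binding itself is left out.

```python
import json

BLOB_CREATED = "Microsoft.Storage.BlobCreated"

def pdf_urls_from_events(payload):
    """Return the URLs of newly created PDF blobs found in an event payload."""
    events = json.loads(payload)
    if isinstance(events, dict):  # tolerate a single event as well as a batch
        events = [events]
    urls = []
    for event in events:
        if event.get("eventType") != BLOB_CREATED:
            continue  # ignore deletions and other event types
        url = event.get("data", {}).get("url", "")
        if url.lower().endswith(".pdf"):
            urls.append(url)
    return urls

if __name__ == "__main__":
    sample = json.dumps([{
        "eventType": "Microsoft.Storage.BlobCreated",
        "data": {"url": "https://example.blob.core.windows.net/invoices/PDF1.pdf"},
    }])
    print(pdf_urls_from_events(sample))
```

The returned URLs would then be queued and consumed one at a time, since routing events to a function does not by itself stop several uploads from being processed in parallel.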