How to read multiple files from a storage container in Azure Functions


I have an Azure Functions application (Python) where I have to read multiple CSV files that are stored in an Azure Storage Account (StorageV2) to validate them.

However, the file names and the number of CSV files in this folder change over time. The application is triggered by an HTTP binding, and ideally it would dynamically check the contents of the folder and then process all the CSV files in it sequentially.

From the documentation it seems that Azure Functions uses bindings for input and output; however, the examples only show (multiple) input bindings that point to a single file, not to a folder or container of any kind. Because I do not know the number of files or their names beforehand, this is difficult to implement.

E.g. function.json:

{
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [
        "get",
        "post"
      ]
    },
    {
      "name": "inputcsv",
      "type": "blob",
      "dataType": "binary",
      "path": "samplesCSVs/{singleCSVfile}",
      "connection": "MyStorageConnectionAppSetting",
      "direction": "in"
    },
    {
      "type": "http",
      "direction": "out",
      "name": "$return"
    }
  ],
  "scriptFile": "__init__.py"
}

Is it possible to point to a folder here? Or dynamically read the files in a Storage Account in another way?

The only alternative I can think of is to zip all the CSV files in advance, so that I can use a single input binding for the zipped file and then unpack it into a temporary folder for processing, but that would be less efficient.

Sources:

https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-input?tabs=python

https://docs.microsoft.com/en-us/azure/azure-functions/functions-add-output-binding-storage-queue-vs-code?tabs=in-process&pivots=programming-language-python

CodePudding user response:

With the Azure Blob trigger you can only match one-to-one: a change to, or creation of, a single blob triggers the execution of a function.

You can use Event Grid and filter events at the container level, and use an Azure Function to handle that particular event:

https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-event-overview
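For illustration, a function.json for an Event Grid-triggered Python function might look like the following sketch (the binding name "event" is arbitrary); each blob-created event then carries the URL of the individual blob:

```json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "type": "eventGridTrigger",
      "direction": "in",
      "name": "event"
    }
  ]
}
```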

CodePudding user response:

It turns out I had a misunderstanding about how Azure Functions works. Since the function is still plain Python code, and Azure provides a Python SDK for connecting to a Storage account and manipulating files, using the SDK directly is the best way to accomplish what I was trying to do.

The input/output bindings of Azure Functions only seem helpful with specific triggers, and they were not required for my problem.
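For reference, a minimal sketch of that approach using the azure-storage-blob SDK. The container name, connection-string setting, and the validation rule (every row must have the same column count) are placeholder assumptions; the SDK calls would run inside the HTTP-triggered function body:

```python
import csv
import io


def validate_csv(data: bytes) -> bool:
    """Return True if the bytes parse as CSV with a consistent column count."""
    rows = list(csv.reader(io.StringIO(data.decode("utf-8"))))
    if not rows:
        return False
    width = len(rows[0])
    return all(len(row) == width for row in rows)


def validate_container(container_client) -> dict:
    """Validate every *.csv blob in the container; returns {blob_name: ok}."""
    results = {}
    # list_blobs() enumerates the container dynamically, so file names and
    # file counts do not need to be known in advance.
    for blob in container_client.list_blobs():
        if not blob.name.lower().endswith(".csv"):
            continue
        data = container_client.download_blob(blob.name).readall()
        results[blob.name] = validate_csv(data)
    return results


# Inside __init__.py, roughly (names are placeholders):
# from azure.storage.blob import ContainerClient
# client = ContainerClient.from_connection_string(conn_str, "samplescsvs")
# results = validate_container(client)
```

Because `validate_container` only relies on `list_blobs()` and `download_blob().readall()`, it can also be exercised locally with a stub container object before wiring it up to a real Storage account.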

Thanks to zolty13 for pointing me in the right direction.

Source:

https://docs.microsoft.com/en-us/python/api/overview/azure/storage-blob-readme?view=azure-python
