How to download all blobs from a container where the blobs sit in a sub-directory structure

Time:10-13

I have blob files that sit in a container in a sub-directory layout:

  • container name = log-test
  • Year folder
  • Month folder
  • Date folder
  • Hour folder


Every hour, a C# background job needs to download all the blob files in the folder for that hour.

With the code below I can download a single file from a hard-coded path (Year/Month/Date/Hour/blob-name), but how do I download all the blob files for the current date and hour?

var blobServiceClient = new BlobServiceClient("conn-str");
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient("log-test");

BlobClient blobClient = containerClient.GetBlobClient("2021/08/11/01/log.txt");

await blobClient.DownloadToAsync(@"C:\Temp\Result.txt");

Note: I tried the following, but is it a performance problem? Can't I go directly to the actual hour folder?

var blobItems = containerClient.GetBlobs()
    .Where(blobItem => blobItem.Properties.LastModified != null
        && blobItem.Properties.LastModified.Value.Year == DateTime.UtcNow.Year
        && blobItem.Properties.LastModified.Value.Month == DateTime.UtcNow.Month
        && blobItem.Properties.LastModified.Value.Day == DateTime.UtcNow.Day
        && blobItem.Properties.LastModified.Value.Hour == DateTime.UtcNow.Hour)
    .ToList();

foreach (var blob in blobItems)
{
    BlobClient blobClient = containerClient.GetBlobClient(blob.Name);

    await blobClient.DownloadToAsync($"C:\\Temp\\{blob.Name.Replace('/','-')}");
}

Answer:

This code of yours

var blobItems = containerClient.GetBlobs()
    .Where(blobItem => blobItem.Properties.LastModified != null
        && blobItem.Properties.LastModified.Value.Year == DateTime.UtcNow.Year
        && blobItem.Properties.LastModified.Value.Month == DateTime.UtcNow.Month
        && blobItem.Properties.LastModified.Value.Day == DateTime.UtcNow.Day
        && blobItem.Properties.LastModified.Value.Hour == DateTime.UtcNow.Hour)
    .ToList();

is not optimized, because it lists every blob in the container and then does the filtering on the client side.

The method you want is BlobContainerClient.GetBlobsByHierarchy, with the Year/Month/Date/Hour/ path (for example, 2021/08/11/01/) passed as the value of the prefix parameter. The service then returns only the blobs for that hour. Once you have that list, you can download the blobs individually.
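
A minimal sketch of that approach, assuming the folder names are zero-padded and based on UTC (as in the 2021/08/11/01/log.txt path from the question). The names HourlyBlobDownloader, BuildHourPrefix and DownloadHourAsync are hypothetical helpers made up for illustration, not part of the Azure SDK:

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

public static class HourlyBlobDownloader
{
    // Hypothetical helper: builds the virtual-folder prefix, e.g. "2021/08/11/01/".
    // Assumes UTC and zero-padded folder names; adjust if the container differs.
    public static string BuildHourPrefix(DateTimeOffset time) =>
        time.UtcDateTime.ToString("yyyy'/'MM'/'dd'/'HH'/'", CultureInfo.InvariantCulture);

    // Hypothetical helper: downloads every blob under the hour folder for 'time'.
    public static async Task DownloadHourAsync(
        BlobContainerClient containerClient, DateTimeOffset time, string targetDirectory)
    {
        string prefix = BuildHourPrefix(time);
        Directory.CreateDirectory(targetDirectory);

        // The prefix is applied by the service, so only blobs under this hour are listed.
        await foreach (BlobHierarchyItem item in
            containerClient.GetBlobsByHierarchyAsync(prefix: prefix))
        {
            if (!item.IsBlob) continue; // skip virtual-directory entries, if any

            // Flatten the blob path into a file name, as in the question's code.
            string localPath = Path.Combine(targetDirectory, item.Blob.Name.Replace('/', '-'));
            await containerClient.GetBlobClient(item.Blob.Name).DownloadToAsync(localPath);
        }
    }
}
```

From the background job this could be called as await HourlyBlobDownloader.DownloadHourAsync(containerClient, DateTimeOffset.UtcNow, @"C:\Temp"). If the job runs just after the hour boundary and is meant to collect the hour that has just finished, passing DateTimeOffset.UtcNow.AddHours(-1) may be closer to what is wanted. GetBlobs also accepts a prefix argument and should work equally well when only a flat listing is needed.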
