How to move certain files from one storage blob container to another?

Time:08-09

I have been trying to find the best way to do the following: I need to move a large number of JSON files, named in the format "yyyymmdd-hhmmss.json", from one blob container to another container in a different storage account. The files are nested inside several different folders.

I only have to move the files that were created (or are named) before a certain date, for example: move all files that were created/are named before 01/01/2022.

What would be the best way to do so quickly? This is a one-time migration so it won't be recurring.

CodePudding user response:

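To copy a blob from a source container to a destination container in another storage account with PowerShell (the script below copies a single blob, $blobName; for a bulk copy, repeat the final copy step for each blob):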

Connect-AzAccount

Get-AzSubscription 
Select-AzSubscription -Subscription "My Subscription"
  
$srcResourceGroupName = "RG-DEMO-WE"
$srcStorageAccountName = "storageaccountdemowe"
$srcContainer = "sourcefolder"
$blobName = "dataDisk.vhd"

$destResourceGroupName = "RG-TRY-ME"
$destStorageAccountName = "storageaccounttryme"
$destContainer = "destinationfolder"
 
# Set Source & Destination Storage Keys and Context
$srcStorageKey = Get-AzStorageAccountKey -Name $srcStorageAccountName -ResourceGroupName $srcResourceGroupName 
 
$destStorageKey = Get-AzStorageAccountKey -Name $destStorageAccountName -ResourceGroupName $destResourceGroupName
 
$srcContext = New-AzStorageContext -StorageAccountName $srcStorageAccountName -StorageAccountKey $srcStorageKey.Value[0]
 
$destContext = New-AzStorageContext -StorageAccountName $destStorageAccountName -StorageAccountKey $destStorageKey.Value[0]

# Optional: create the destination container if it does not already exist
New-AzStorageContainer -Name $destContainer  -Context $destContext   

# The copy operation 
$copyOperation = Start-AzStorageBlobCopy -SrcBlob $blobName `
                                         -SrcContainer $srcContainer `
                                         -Context $srcContext `
                                         -DestBlob $blobName `
                                         -DestContainer $destContainer `
                                         -DestContext $destContext
 

REF: https://www.jorgebernhardt.com/copy-blob-powershell/

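Since you need to pick individual files by date, an alternative to Start-AzStorageBlobCopy is the asynchronous copy command in the Azure CLI, as described in the Microsoft documentation. Note that az storage file copy start writes to an Azure Files share; when the destination is a blob container, the counterpart is az storage blob copy start. The documented syntax of the file variant is: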

az storage file copy start --destination-path
                           --destination-share
                           [--account-key]
                           [--account-name]
                           [--connection-string]
                           [--file-endpoint]
                           [--file-snapshot]
                           [--metadata]
                           [--sas-token]
                           [--source-account-key]
                           [--source-account-name]
                           [--source-blob]
                           [--source-container]
                           [--source-path]
                           [--source-sas]
                           [--source-share]
                           [--source-snapshot]
                           [--source-uri]
                           [--timeout]

REF: https://docs.microsoft.com/en-us/cli/azure/storage/file/copy?view=azure-cli-latest

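I'll leave the code that loops through the files by date to the reader. As a starting point, the line below filters local files by last-write time; for blobs, you would apply the same kind of Where-Object filter to the output of Get-AzStorageBlob, e.g.: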

Get-ChildItem | Where-Object {$_.LastWriteTime -lt (Get-Date).AddDays(-30)}
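
For the date filter itself, here is a minimal POSIX shell sketch to go with the Azure CLI route. It assumes the date can be read from the file name (yyyymmdd-hhmmss.json) rather than from blob properties. The function names filter_before and migrate_before and the SRC_KEY / DEST_KEY variables are placeholders of my own, and the account and container names are reused from the PowerShell example above. It uses az storage blob copy start because the destination here is a blob container. Treat it as a starting point, not a tested migration script.

```shell
# Read blob names (one per line) on stdin and print those whose file name
# encodes a date earlier than the cutoff, which is given as yyyymmdd.
filter_before() {
  fb_cutoff="$1"
  while IFS= read -r fb_name; do
    fb_base="${fb_name##*/}"   # drop the virtual folder prefix
    case "$fb_base" in
      # Accept only names shaped like yyyymmdd-hhmmss.json.
      [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9].json) ;;
      *) continue ;;
    esac
    # The date part is a fixed-width number, so a numeric compare is chronological.
    if [ "${fb_base%%-*}" -lt "$fb_cutoff" ]; then
      printf '%s\n' "$fb_name"
    fi
  done
  return 0
}

# Hypothetical driver (defined here, not invoked): list the source container,
# keep the old blobs, and start one server-side copy per blob.
# Assumes SRC_KEY and DEST_KEY hold the two storage account keys.
migrate_before() {
  az storage blob list \
      --account-name storageaccountdemowe --account-key "$SRC_KEY" \
      --container-name sourcefolder --num-results "*" \
      --query "[].name" --output tsv |
    filter_before "$1" |
    while IFS= read -r blob; do
      az storage blob copy start \
          --account-name storageaccounttryme --account-key "$DEST_KEY" \
          --destination-container destinationfolder --destination-blob "$blob" \
          --source-account-name storageaccountdemowe --source-account-key "$SRC_KEY" \
          --source-container sourcefolder --source-blob "$blob"
    done
}
```

Calling migrate_before 20220101 would start a copy for every blob dated before 1 January 2022, keeping the folder path in the destination name. It only starts the copies: since they run asynchronously, deleting the source blobs (to make this a move) should wait until each copy reports success, for example by checking the copy status that az storage blob show returns for the destination blob.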

CodePudding user response:

You can iterate over every blob in the source container regardless of the folder structure, because blob folders are only virtual. Parse each blob's name against the pattern "yyyymmdd-hhmmss" to get its date; if that date is earlier than the cutoff you choose, copy the blob to the destination container and then delete it from the source container. I'm not sure about PowerShell, but it's easy in any programming language with a supported SDK.

Here's an example of doing this with .NET:

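using System;
using System.Globalization;
using System.IO;
using Azure.Storage.Blobs;
using Azure.Storage.Sas;

BlobContainerClient sourceContainerClient = new BlobContainerClient("<source-connection-string>", "<source-container-name>");
BlobContainerClient destinationContainerClient = new BlobContainerClient("<destination-connection-string>", "<destination-container-name>");
// Move only blobs whose name encodes a date before this cutoff (01/01/2022 in the question).
DateTime cutoff = new DateTime(2022, 1, 1);
foreach (var blobItem in sourceContainerClient.GetBlobs())
{
    try
    {
        var sourceBlob = sourceContainerClient.GetBlobClient(blobItem.Name);
        // blobItem.Name includes the virtual folders; keep only the file name without ".json".
        string stamp = Path.GetFileNameWithoutExtension(blobItem.Name);
        // "HH" is the 24-hour clock; skip blobs that do not follow the naming pattern.
        if (!DateTime.TryParseExact(stamp, "yyyyMMdd-HHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None, out DateTime blobTime)
            || blobTime >= cutoff)
        {
            continue;
        }
        // Keep the full name so the folder structure and extension survive the move.
        var destinationBlob = destinationContainerClient.GetBlobClient(blobItem.Name);
        // A cross-account copy needs a readable source URI; a short-lived SAS works when
        // the source connection string contains the account key.
        Uri sourceUri = sourceBlob.GenerateSasUri(BlobSasPermissions.Read, DateTimeOffset.UtcNow.AddHours(1));
        // The copy runs asynchronously on the service: wait for it before deleting the source.
        destinationBlob.StartCopyFromUri(sourceUri).WaitForCompletion();
        sourceBlob.Delete();
    }
    catch (Exception ex)
    {
        Console.Error.WriteLine($"Failed to move {blobItem.Name}: {ex.Message}");
    }
}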