I want to copy files from blob storage A to blob storage B, with the condition that only new files are copied/written to storage B. I'm using the BlockBlobService package to list all files in blob A, but does this package also have a function to copy/write a new file to another blob storage? I'm writing this in Python, by the way. Please help me out, I'm a bit helpless now. I tried to use the DataLakeServiceClient package to write a file to Azure blob storage B, but DataLakeServiceClient is not compatible with BlockBlobService, so I don't know what to do.
If you have tried another method to do the same thing, please share your wisdom and knowledge with me.
CodePudding user response:
I would suggest trying the azcopy tool. It supports copying data between storage accounts. For example:
azcopy copy 'https://sourceacc.blob.core.windows.net/container/dir' 'https://destacc.blob.core.windows.net/container' --recursive
Then use it with the --overwrite flag set to false or ifSourceNewer to specify the behavior for existing blobs at the destination:
--overwrite (string) Overwrite the conflicting files and blobs at the destination if this flag is set to true. (default 'true') Possible values include 'true', 'false', 'prompt', and 'ifSourceNewer'. For destinations that support folders, conflicting folder-level properties will be overwritten this flag is 'true' or if a positive response is provided to the prompt. (default "true")
See this doc for how to get started.
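Since the question is about Python, here is a minimal sketch of driving the same azcopy command from a script. It assumes azcopy is installed and on the PATH and that both URLs already carry whatever SAS token or login they need; build_azcopy_command and run_azcopy are hypothetical helper names, not part of any Azure package.

```python
import subprocess

# Values the --overwrite flag accepts, per the help text quoted above.
ALLOWED_OVERWRITE = {"true", "false", "prompt", "ifSourceNewer"}


def build_azcopy_command(source_url, dest_url, overwrite="false"):
    """Build the argument list for `azcopy copy` (hypothetical helper)."""
    if overwrite not in ALLOWED_OVERWRITE:
        raise ValueError("unsupported --overwrite value: %r" % overwrite)
    return [
        "azcopy", "copy", source_url, dest_url,
        "--recursive",
        "--overwrite=" + overwrite,
    ]


def run_azcopy(source_url, dest_url, overwrite="false"):
    """Run azcopy and raise CalledProcessError if it exits with an error."""
    cmd = build_azcopy_command(source_url, dest_url, overwrite)
    subprocess.run(cmd, check=True)
```

With overwrite="false", blobs that already exist at the destination are skipped, which matches the "only new files" requirement; ifSourceNewer would additionally refresh blobs whose source copy is newer.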
CodePudding user response:
After reproducing this on my end, I was able to achieve it using get_blob_to_path and create_blob_from_path of BlockBlobService. Below is the complete code that worked for me.
# BlockBlobService comes from the legacy azure-storage-blob package (2.x and earlier).
from azure.storage.blob import BlockBlobService
import os

SOURCE_ACCOUNT_NAME = "<source_Account_Name>"
SOURCE_CONTAINER_NAME = "<source-container>"
SOURCE_SAS_TOKEN = "<Source_Storage_Account_SAS_Token>"
DESTINATION_ACCOUNT_NAME = "<destination_Account_Name>"
DESTINATION_CONTAINER_NAME = "<destination-container>"
DESTINATION_SAS_TOKEN = "<Destination_Storage_Account_SAS_Token>"

# One client per storage account, each authenticated with its own SAS token.
source_blob_service = BlockBlobService(account_name=SOURCE_ACCOUNT_NAME, account_key=None, sas_token=SOURCE_SAS_TOKEN)
destination_blob_service = BlockBlobService(account_name=DESTINATION_ACCOUNT_NAME, account_key=None, sas_token=DESTINATION_SAS_TOKEN)

# Copy every blob: download to a local file, upload it, then delete the local file.
generator = source_blob_service.list_blobs(SOURCE_CONTAINER_NAME)
for blob in generator:
    blobname = blob.name
    source_blob_service.get_blob_to_path(SOURCE_CONTAINER_NAME, blobname, blobname, 'wb')
    destination_blob_service.create_blob_from_path(DESTINATION_CONTAINER_NAME, blobname, blobname)
    os.remove(blobname)
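The code above copies every blob, so to meet the "only new files" condition from the question, one option is to list the destination first and skip names that already exist there. The following is a rough sketch under a few assumptions: copy_new_blobs is a hypothetical helper, it is handed two BlockBlobService clients like the ones created above, "new" means "no blob with that name at the destination", and it uses copy_blob with a SAS-signed source URL from make_blob_url so nothing is downloaded locally.

```python
def copy_new_blobs(source_service, dest_service,
                   source_container, dest_container, source_sas_token):
    """Copy blobs that do not yet exist at the destination (hypothetical helper).

    Both services are expected to behave like the legacy BlockBlobService.
    Returns the names of the blobs for which a copy was started.
    """
    # Names already present at the destination; these are skipped.
    existing = {blob.name for blob in dest_service.list_blobs(dest_container)}

    copied = []
    for blob in source_service.list_blobs(source_container):
        if blob.name in existing:
            continue
        # The destination account reads the source blob through this URL,
        # so the SAS token must grant read access to the source container.
        source_url = source_service.make_blob_url(
            source_container, blob.name, sas_token=source_sas_token)
        dest_service.copy_blob(dest_container, blob.name, source_url)
        copied.append(blob.name)
    return copied
```

It could be called as copy_new_blobs(source_blob_service, destination_blob_service, SOURCE_CONTAINER_NAME, DESTINATION_CONTAINER_NAME, SOURCE_SAS_TOKEN). Note that copy_blob only starts a server-side copy, which may finish asynchronously for large blobs, and that comparing by name will not pick up source blobs that changed after they were first copied.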