I would like to manipulate (delete) files and folders in my ADLS container using the Python SDK. I have two questions:
- Which of the hundreds of Azure SDKs to use for this purpose?
- How to authenticate using AAD token? I really prefer authenticating this way, or also using credentials for service principal (username, password, tenant)
I already looked at:
- azure.storage.filedatalake
- azure.storage.blob
and authentication using
- azure.identity - TokenCredentials, AzureCliCredential
CodePudding user response:
Which of the hundreds of Azure SDKs to use for this purpose?
Since ADLS Gen2 is built on top of Blob Storage, you can use either azure.storage.filedatalake or azure.storage.blob. However, the recommendation is to use azure.storage.filedatalake, as this SDK is designed for ADLS Gen2.
How to authenticate using AAD token? I really prefer authenticating this way, or also using credentials for service principal (username, password, tenant)
Please refer to "Authorize access to blobs using Azure Active Directory" for how to connect to Azure Storage using Azure AD. The key thing to remember is that whichever identity (a user or a service principal) connects to Azure Storage with Azure AD credentials must be assigned a role that grants Azure Storage data operations permissions, e.g. Storage Blob Data Contributor.
Once you have done that, simply create a credential object and then use that credential object to connect to Azure Storage.
For example, take a look at the code sample below (taken from a linked sample):
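from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

# The self.* attributes come from the class in the original sample; they hold
# the tenant ID, the application (client) ID, the client secret, and the
# storage account name.
token_credential = ClientSecretCredential(
    self.active_directory_tenant_id,
    self.active_directory_application_id,
    self.active_directory_application_secret,
)
datalake_service_client = DataLakeServiceClient(
    "https://{}.dfs.core.windows.net".format(self.account_name),
    credential=token_credential,
)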
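Since the question is specifically about deleting files and folders, here is a minimal sketch of how the client above might be used for that. The helper names build_service_client and delete_path, and their parameters, are hypothetical and do not come from the SDK or the original sample. The SDK calls themselves (get_file_system_client, get_directory_client, get_file_client, delete_directory, delete_file) belong to azure.storage.filedatalake. The sketch assumes the service principal already has a suitable data role on the storage account.

```python
def build_service_client(account_name, tenant_id, client_id, client_secret):
    """Create a DataLakeServiceClient authenticated as a service principal."""
    # Imported here (an assumption of this sketch) so that delete_path below
    # can be used without the Azure packages being imported at module load.
    from azure.identity import ClientSecretCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    credential = ClientSecretCredential(tenant_id, client_id, client_secret)
    account_url = "https://{}.dfs.core.windows.net".format(account_name)
    return DataLakeServiceClient(account_url, credential=credential)


def delete_path(service_client, container, path, is_directory):
    """Delete one file or one directory inside the given container."""
    # In ADLS Gen2 terminology a container is a "file system".
    file_system_client = service_client.get_file_system_client(container)
    if is_directory:
        file_system_client.get_directory_client(path).delete_directory()
    else:
        file_system_client.get_file_client(path).delete_file()
    return path
```

With a client from build_service_client, a call such as delete_path(client, "my-container", "raw/old-data", is_directory=True) would remove that folder; the container and path names here are placeholders.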