I am having some trouble with Synapse notebooks. I want to get a list of blobs via a PySpark script so that I can dynamically decide which files to integrate. I cannot get this to work in Synapse; in other environments, such as a Jupyter notebook, the code works as expected.
from azure.storage.blob import ContainerClient, BlobServiceClient, AccountSasPermissions, ResourceTypes
from azure.storage.blob._shared_access_signature import SharedAccessSignature, BlobSharedAccessSignature

# SAS token appended to the container URL as its query string
sas_token = 'hardcoded_value'
account_url1 = 'https://storage_account.blob.core.windows.net/container' + sas_token

print(account_url1)
# Build a client for the container and print the name of every blob in it
container_client = ContainerClient.from_container_url(container_url=account_url1)
source_blob_list = container_client.list_blobs()
for blob in source_blob_list:
    print(blob.name + '\n')
The output from the code above in Synapse is:
ServiceRequestError: <urllib3.connection.HTTPSConnection object at 0x7f282242e130>: Failed to establish a new connection: [Errno -2] Name or service not known
The output from the code above in a Jupyter notebook is as expected.
For more information, refer to this MS document.
CodePudding user response:
In the end, the problem was permissions for the managed identity of Synapse. As I stated, the code above was working outside of Synapse. Once we added permissions to the managed private endpoint of Synapse, everything worked. Thank you!
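For anyone hitting the same "Name or service not known" error: that message usually means the host name could not be resolved, so a quick check from inside the notebook can help tell a network or endpoint problem apart from a problem in the blob code itself. Below is a minimal sketch using only the Python standard library. The helper name can_resolve is my own invention, and the host is the placeholder from the question, so substitute your real storage account name.

```python
import socket


def can_resolve(hostname: str, port: int = 443) -> bool:
    """Return True if this environment can resolve hostname to an address."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the name lookup failed (the same kind of failure as
        # "[Errno -2] Name or service not known").
        # UnicodeError: the host name is not a valid name at all.
        return False


if __name__ == "__main__":
    # Placeholder host taken from the question; replace with your own account.
    host = "storage_account.blob.core.windows.net"
    status = "resolves" if can_resolve(host) else "does not resolve"
    print(host, status)
```

If the host does not resolve from Synapse but does from your own machine, the blob code is probably fine and the Synapse network or endpoint configuration is the place to look, which matches what happened in my case.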