Azure ADLS Gen2 File read using Python (without ADB)


I want to read files (CSV or JSON) from Azure ADLS Gen2 storage using Python, without Azure Databricks (ADB).

file  = DataLakeFileClient.from_connection_string(conn_str=conn_string,file_system_name="test", file_path="source")

with open("./test.csv", "r") as my_file:
    file_data = file.read_file(stream=my_file)

Error: AttributeError: 'DataLakeFileClient' object has no attribute 'read_file'

My goal is to read CSV files from ADLS Gen2 and convert them to JSON. Calling download.readall() also throws ValueError: This pipeline didn't have the RawDeserializer policy; can't deserialize.

CodePudding user response:

Try the below piece of code and see if it resolves the error:

import os, uuid, sys
from azure.storage.filedatalake import DataLakeServiceClient

service_client = DataLakeServiceClient.from_connection_string("DefaultEndpointsProtocol=https;AccountName=***;AccountKey=*****;EndpointSuffix=core.windows.net")

file_system_client = service_client.get_file_system_client(file_system="test")

directory_client = file_system_client.get_directory_client("testdirectory")

file_client = directory_client.get_file_client("test.txt")

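# download_file() returns a downloader object; readall() returns the content as bytes.
download = file_client.download_file()
downloaded_bytes = download.readall()

# Write the downloaded bytes to a local file.
with open("./sample.txt", "wb") as my_file:
    my_file.write(downloaded_bytes)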

Also, see the Microsoft doc "Use Python to manage directories and files" for more information.
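Since the question also asks about converting the CSV to JSON, here is a minimal sketch of that step using only the standard library. It assumes the CSV has a header row and is UTF-8 encoded; the helper name csv_bytes_to_json and the sample data are hypothetical and not part of the Azure SDK. In real use, the bytes would come from file_client.download_file().readall() as in the code above.

```python
import csv
import io
import json


def csv_bytes_to_json(data: bytes, encoding: str = "utf-8-sig") -> str:
    """Convert raw CSV bytes (header row first) into a JSON array of objects.

    Hypothetical helper for illustration; "utf-8-sig" also strips a BOM if present.
    """
    text = data.decode(encoding)
    # newline="" leaves line endings untouched so the csv module can parse them.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return json.dumps(list(reader), indent=2)


# Assumption: with the file_client from the answer, the bytes would come from
#   downloaded_bytes = file_client.download_file().readall()
# A small in-memory sample stands in for the downloaded file here.
sample = b"id,name\r\n1,alice\r\n2,bob\r\n"
print(csv_bytes_to_json(sample))
```

Every value comes back as a string, because csv.DictReader does not infer types; convert the fields explicitly if numeric values are needed in the JSON.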
