Azure ADLS Gen2 File read using Python (without ADB)

Time:07-01

I want to read files (CSV or JSON) from Azure ADLS Gen2 storage using Python (without ADB).

file  = DataLakeFileClient.from_connection_string(conn_str=conn_string,file_system_name="test", file_path="source")

with open("./test.csv", "r") as my_file:
    file_data = file.read_file(stream=my_file)

Error: Exception has occurred: AttributeError 'DataLakeFileClient' object has no attribute 'read_file'

My goal is to read CSV files from ADLS Gen2 and convert them to JSON. Calling download.readall() also fails, with ValueError: This pipeline didn't have the RawDeserializer policy; can't deserialize.

CodePudding user response:

Try the code below and see if it resolves the error:

from azure.storage.filedatalake import DataLakeServiceClient

# Connect to the storage account (account name and key are masked here).
service_client = DataLakeServiceClient.from_connection_string("DefaultEndpointsProtocol=https;AccountName=***;AccountKey=*****;EndpointSuffix=core.windows.net")

# Navigate from the file system (container) to the directory to the file.
file_system_client = service_client.get_file_system_client(file_system="test")
directory_client = file_system_client.get_directory_client("testdirectory")
file_client = directory_client.get_file_client("test.txt")

# Download the file and read its entire content into memory as bytes.
download = file_client.download_file()
downloaded_bytes = download.readall()

# Write the bytes to a local file; the with block closes the file automatically.
with open("./sample.txt", "wb") as my_file:
    my_file.write(downloaded_bytes)

Also, see the Microsoft doc "Use Python to manage directories and files" for more information.
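
Since the stated goal is to convert the CSV to JSON, the bytes returned by readall() could be parsed in memory instead of being written to disk first. The following is a minimal sketch using only the standard library. The helper name csv_bytes_to_json is hypothetical, and it assumes the file is UTF-8 encoded with a header row:

```python
import csv
import io
import json


def csv_bytes_to_json(data: bytes) -> str:
    """Convert CSV content (as bytes) into a JSON array of row objects.

    Hypothetical helper, not part of the Azure SDK. Assumes UTF-8 input
    whose first row holds the column names.
    """
    # "utf-8-sig" also strips a leading byte-order mark if one is present.
    text = data.decode("utf-8-sig")
    # DictReader maps each row to {column name: value}; all values stay strings.
    rows = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(rows)
```

With the answer's code above, this would be called as csv_bytes_to_json(downloaded_bytes) once the download succeeds. Every value stays a string, so numeric columns would need an explicit conversion if typed JSON is required.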
