Connecting to Azure Synapse Serverless SQL using Python

I am trying to connect to Azure Synapse Serverless SQL using Python but am getting the following error:

External table 'dbo.my_table' is not accessible because location does not exist or it is used by another process.
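
A minimal pyodbc call that hits this error looks roughly like the following (the workspace name, database, and login details are placeholders):

import pyodbc

# Placeholder connection details -- the real workspace, database, and login are not shown here.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"
    "DATABASE=<database>;"
    "UID=<user>;"
    "PWD=<password>;"
)

cursor = conn.cursor()
# This SELECT is what raises the "location does not exist" error.
cursor.execute("SELECT TOP 10 * FROM dbo.my_table")
for row in cursor.fetchall():
    print(row)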

I created the external table using the following commands.

First, I created a database scoped credential:

CREATE DATABASE SCOPED CREDENTIAL python_access_credential WITH IDENTITY = 'SHARED ACCESS SIGNATURE', SECRET =  '<SAS TOKEN>';

Then I created the external data source with the above database scoped credential using the following command:

CREATE EXTERNAL DATA SOURCE python_data WITH ( LOCATION ='abfss://<StorageAccount>.dfs.core.windows.net', CREDENTIAL= [python_access_credential]);

Then I created the external table:

CREATE EXTERNAL TABLE my_table ([C1] bigint, [C2] nvarchar(4000)) WITH (LOCATION = 'data.tsv', DATA_SOURCE = [python_data], FILE_FORMAT = [SynapseDelimitedTextFormat]);

Then I ran the following command to grant access to the user used in my Python script:

GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[python_access_credential] TO [user];

Also, the same error occurs when I try to access the table via Synapse Studio.

I have referred to many Azure documents, Microsoft Q&A pages, and one other Stack Overflow question, but none of them resolved the error.

CodePudding user response:

There are two types of workspace databases you can create. If you are not using Spark SQL, select the first type.

Click the plus icon and add a SQL database called [stack_overflow] to the serverless pools. This type of pool is great since you only get charged when you query the data.

We can now see that our new database, [stack_overflow], exists.

If you are familiar with Azure Data Factory, configure a linked service that connects to your data storage and browse to the file that you want to convert into a table. Right-click the file and choose the "create external table" menu option.

Please see the MSDN article for how to create a linked service.

It will ask you for the default text field size, which is 4000 bytes.

Next, choose to create and open a script. There is a bug in the GUI: while it states you can create a new schema and a new table within the data entry fields, it does not generate the code for the schema.

Just add two lines to the generated script to create the schema, and change the CREATE EXTERNAL TABLE statement to prefix the table with the schema called [sales_lt]. Please note, this generates the code you showed above without any real work on your part.

We need to add an Azure AD user to the Synapse workspace and give it rights to view the table. I am going to start with sysadmin rights; you can reduce the rights once you have it working.

Find the endpoint for the Synapse serverless SQL pool. In short, the service looks and acts like SQL Server, using the tabular data stream (TDS) protocol for the data.

We can use the latest driver from Microsoft and write Python code to connect to the database. We need to use the Active Directory with password option, since we are connecting as an Azure AD user.

Please see the MSDN documentation for how to download the latest driver (native ODBC).

Enter the fully qualified address of the Synapse serverless pool. Add the user, [email protected], and the correct password.

The Advanced tab allows you to make connection changes. Make sure TCP/IP is used as the network protocol and the default database is [stack_overflow].

Last but not least, the SELECT statement retrieves data from the Synapse serverless pool and returns it to the client, SSMS. In short, the solution works. Now we just need to code the connection and data retrieval using Python.

Here is a link on how to use the latest ODBC driver for SQL Server from Python.

https://datatofish.com/how-to-connect-python-to-sql-server-using-pyodbc/
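
As a minimal sketch, a pyodbc connection using the Active Directory with password option could look like the following (the endpoint, Azure AD user, password, and table name are placeholders, and ODBC Driver 17 for SQL Server is assumed to be installed):

import pyodbc

# Serverless (on-demand) SQL endpoint of the Synapse workspace -- placeholder name.
server = "<workspace>-ondemand.sql.azuresynapse.net"

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    f"SERVER={server};"
    "DATABASE=stack_overflow;"                 # the serverless database created above
    "UID=<aad_user>@<your_tenant>.com;"        # the Azure AD user added to the workspace
    "PWD=<password>;"
    "Authentication=ActiveDirectoryPassword;"
)

cursor = conn.cursor()
# Same SELECT we ran in SSMS, now issued from Python.
cursor.execute("SELECT TOP 10 * FROM [sales_lt].[<table_name>]")
for row in cursor.fetchall():
    print(row)
conn.close()

If the query works in SSMS but not from Python, double-check that the connection string points at the same serverless endpoint, database, and Azure AD account.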

Please remember, serverless pools provide read access to files in the data lake. These files look like tables in a database. No write options are supported with this service.

CodePudding user response:

I was able to solve the problem by using a Managed Identity credential instead of a Shared Access Signature. I ran the following code to create the credential and external table.

Create a master key first.

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<your_password_1>'

Now create a Managed Identity database scoped credential

CREATE DATABASE SCOPED CREDENTIAL <your_credential_name> WITH IDENTITY = 'Managed Identity'

Create a login with a password. You will use it in your Python script to run queries.

CREATE LOGIN <your_login> WITH PASSWORD = '<your_login_password>'

The above login is to be used as the UID in the Python script. Now create a user for that login and grant it access to the database scoped credential.

CREATE USER <your_user> FOR LOGIN <your_login>

GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::<your_credential_name> TO <your_user>

Now create an external data source with the above credential, and use that data source to create an external table.

IF NOT EXISTS (SELECT * FROM sys.external_file_formats WHERE name = '<FileFormatName>') 
CREATE EXTERNAL FILE FORMAT [<FileFormatName>] 
WITH ( FORMAT_TYPE = DELIMITEDTEXT ,
       FORMAT_OPTIONS (
         FIELD_TERMINATOR = ',',
         USE_TYPE_DEFAULT = FALSE
        ))
GO

IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name = '<ExternalDataSourceName>') 
CREATE EXTERNAL DATA SOURCE [<ExternalDataSourceName>] 
WITH (
    CREDENTIAL = <your_credential_name>,
    LOCATION = 'abfss://<container>@<storage-account>.dfs.core.windows.net' 
)
GO

CREATE EXTERNAL TABLE [<table_name>] (
    [C1] nvarchar(4000),
    [C2] nvarchar(4000),
    [C3] bigint,
    [C4] bigint
)
WITH (
    LOCATION = '<Location To File in Container>',
    DATA_SOURCE = [<ExternalDataSourceName>],
    FILE_FORMAT = [<FileFormatName>]
)
GO

Now, as the final step, grant SELECT access.

GRANT SELECT ON OBJECT::[dbo].[<table_name>] TO [<your_user>]
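
With those grants in place, a short pyodbc sketch that uses the SQL login created above might look like this (the serverless endpoint, database, and table name are placeholders):

import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"   # serverless SQL endpoint
    "DATABASE=<database>;"
    "UID=<your_login>;"                                   # login created with CREATE LOGIN above
    "PWD=<your_login_password>;"
)

cursor = conn.cursor()
cursor.execute("SELECT TOP 10 * FROM [dbo].[<table_name>]")
for row in cursor.fetchall():
    print(row)
conn.close()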