I need to list all of the columns (including schema and table) used in a SQL query.
The reason I am doing this is because I need to document all of the required data that is being used by a shortlist of existing Power BI reports, stored procedures and views. I am then doing a gap analysis to see what data is already available in a new environment and documenting what data needs to be migrated in order to be able to reproduce the reports, stored procedures and views that are in the shortlist.
Input
SELECT FullName, EmailAddress
FROM dbo.Customers t1
LEFT JOIN dbo.Customer_EmailAddress t2 ON t1.CustomerID = t2.CustomerID
Output should be:
dbo.Customers.CustomerID
dbo.Customers.FirstName
dbo.Customer_EmailAddress.CustomerID
dbo.Customer_EmailAddress.EmailAddress
Does anyone know of a way to do this? I need to do this for a large number of queries and I am currently doing it manually which is very time-consuming!
CodePudding user response:
Once you have a query, you can get an execution plan in XML format, E.g. by using the following code.
SET SHOWPLAN_XML ON;
GO
-- your query here
select ...
GO
SET SHOWPLAN_XML OFF;
GO
The execution plan contains ColumnReference elements which hold the data that you need (database, schema, table & column), E.g.:
<ColumnReference Database="[mydb]" Schema="[sys]" Table="[sysschobjs]" Alias="[o]" Column="id">
Insert the XML that you got into a table, E.g.
create table ExecutionPlan(id int primary key,execution_plan xml)
-- cannot insert the real XML here due to SO post length limitations
insert into ExecutionPlan values (1, '<ShowPlanXML ...');
insert into ExecutionPlan values (2, '<ShowPlanXML ...');
insert into ExecutionPlan values (3, '<ShowPlanXML ...');
When done with the inserts, use the following query to extract the columns
with xmlnamespaces('http://schemas.microsoft.com/sqlserver/2004/07/showplan' as ns)
select distinct
ep.id
,e.val.value ('@Database' ,'sysname') as db
,e.val.value ('@Schema' ,'sysname') as scm
,e.val.value ('@Table' ,'sysname') as tab
,e.val.value ('@Column' ,'sysname') as col
from ExecutionPlan as ep
cross apply execution_plan.nodes ('//ns:ColumnReference') as e (val)
where e.val.value ('@Database' ,'sysname') is not null
order by id, db, scm, tab, col
---- ----------------------- ------- -------------------- ----------
| id | db | scm | tab | col |
---- ----------------------- ------- -------------------- ----------
| 1 | [mydb] | [dbo] | [mytable] | i |
| 1 | [mydb] | [dbo] | [mytable] | v |
| 2 | [mydb] | [dbo] | [mytable] | i |
| 2 | [mydb] | [dbo] | [mytable] | v |
| 3 | [mssqlsystemresource] | [sys] | [syspalnames] | class |
| 3 | [mssqlsystemresource] | [sys] | [syspalnames] | name |
| 3 | [mssqlsystemresource] | [sys] | [syspalnames] | value |
| 3 | [mssqlsystemresource] | [sys] | [syspalvalues] | class |
| 3 | [mssqlsystemresource] | [sys] | [syspalvalues] | name |
| 3 | [mssqlsystemresource] | [sys] | [syspalvalues] | value |
| 3 | [mydb] | [sys] | [sysidxstats] | id |
| 3 | [mydb] | [sys] | [sysidxstats] | indid |
| 3 | [mydb] | [sys] | [sysidxstats] | intprop |
| 3 | [mydb] | [sys] | [sysmultiobjrefs] | class |
| 3 | [mydb] | [sys] | [sysmultiobjrefs] | depid |
| 3 | [mydb] | [sys] | [sysobjvalues] | objid |
| 3 | [mydb] | [sys] | [sysobjvalues] | valclass |
| 3 | [mydb] | [sys] | [sysobjvalues] | valnum |
| 3 | [mydb] | [sys] | [sysobjvalues] | value |
| 3 | [mydb] | [sys] | [sysschobjs] | created |
| 3 | [mydb] | [sys] | [sysschobjs] | id |
| 3 | [mydb] | [sys] | [sysschobjs] | intprop |
| 3 | [mydb] | [sys] | [sysschobjs] | modified |
| 3 | [mydb] | [sys] | [sysschobjs] | name |
| 3 | [mydb] | [sys] | [sysschobjs] | nsclass |
| 3 | [mydb] | [sys] | [sysschobjs] | nsid |
| 3 | [mydb] | [sys] | [sysschobjs] | pclass |
| 3 | [mydb] | [sys] | [sysschobjs] | pid |
| 3 | [mydb] | [sys] | [sysschobjs] | status |
| 3 | [mydb] | [sys] | [sysschobjs] | status2 |
| 3 | [mydb] | [sys] | [sysschobjs] | type |
| 3 | [mydb] | [sys] | [syssingleobjrefs] | class |
| 3 | [mydb] | [sys] | [syssingleobjrefs] | depid |
| 3 | [mydb] | [sys] | [syssingleobjrefs] | depsubid |
| 3 | [mydb] | [sys] | [syssingleobjrefs] | indepid |
---- ----------------------- ------- -------------------- ----------
P.S.
You would probably want to automate the process by writing a code (e.g. in Python) that connect to your database, pick a query, get its execution plan and save it into a table.
CodePudding user response:
You can use sp_describe_first_result_set
or sys.dm_exec_describe_first_result_set
for this.
For example:
create table t(id int, v varchar(30))
select
name as ActualColumnName,
concat(
QUOTENAME(s.source_database),
'.',
QUOTENAME(s.source_schema),
'.',
QUOTENAME(s.source_table),
'.')
QUOTENAME(s.source_column) AS SourceName
from sys.dm_exec_describe_first_result_set(N'
SELECT id, v
FROM dbo.t
', N'', 1) s;
The second parameter is for variables. For example:
select
name as ActualColumnName,
concat(
QUOTENAME(s.source_database),
'.',
QUOTENAME(s.source_schema),
'.',
QUOTENAME(s.source_table),
'.')
QUOTENAME(s.source_column) AS SourceName
from sys.dm_exec_describe_first_result_set(N'
SELECT id, v, @v
FROM dbo.t
', N'@v varchar(100)', 1) s
For a list of every reference used in a query, you can use sys.dm_sql_referenced_entities
. It only works on stored procedures though.
CodePudding user response:
Try following code:
select sc.TABLE_SCHEMA N'.' o.name N'.' c.name from sys.all_objects o
inner join sys.all_columns c on o.object_id = c.object_id
inner join INFORMATION_SCHEMA.TABLES sc on o.name = sc.TABLE_NAME
where type = N'U'
With above SQL Server Query you can get the list of columns as you want:
dbo.Customers.CustomerID
then you can filter your table and columns to reach your needed columns.