I am trying to output all the schemas, tables, columns, distinct column value, and count of each distinct value to a new table.
I found the first three columns (schema, table, column) from the information.schema_columns
, however the last two columns are not as easy.
I've tried to use cursor to output the distinct column value and the count of each distinct value, but only managed to find the total number of distinct values per column.
However, I need to group by each value and to output the value itself and the count of each value.
For example:
SchemaName | TableName | ColumnName | ColumnGrouppedValue | ValueCount |
---|---|---|---|---|
dbo | Table1 | Column1 | Value1 | 5 |
dbo | Table1 | Column1 | Value2 | 2 |
dbo | Table1 | Column2 | Value1 | 1 |
dbo | Table1 | Column2 | Value2 | 26 |
dbo | Table2 | Column1 | Value1 | 10 |
dbo | Table2 | Column1 | Value2 | 8 |
Here is the code I used, but as mentioned earlier it doesn't output exactly what I need.
DECLARE @table_name AS NVARCHAR(128);
DECLARE @schema_name AS NVARCHAR(128);
DECLARE @column_name AS NVARCHAR(128);
DECLARE @distinct_count int;
DECLARE @sql AS NVARCHAR(MAX);
DECLARE tables_cursor CURSOR FOR
SELECT table_schema, table_name, column_name
FROM information_schema.columns
ORDER BY TABLE_SCHEMA, table_name, column_name;
OPEN tables_cursor;
FETCH NEXT FROM tables_cursor INTO @schema_name, @table_name, @column_name;
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = N'SELECT @count = COUNT(DISTINCT ' @column_name N') FROM ' @schema_name '.' @table_name;
EXEC sp_executesql @sql, N'@count INT OUTPUT', @count = @distinct_count OUTPUT;
IF @schema_name <> 'sys' AND @table_name <> 'Tool1'
BEGIN
SELECT @schema_name AS tableschema_name, @table_name AS table_name, @column_name AS column_name, @distinct_count AS distinct_count;
INSERT INTO dbo.Tool1 (SchemaName, TableName, ColumnName, [Count])
SELECT
@schema_name AS SchemaName
,@table_name AS TableName
,@column_name AS ColumnName
,@distinct_count AS [Count];
END
FETCH NEXT FROM tables_cursor INTO @schema_name, @table_name, @column_name;
END
CLOSE tables_cursor;
DEALLOCATE tables_cursor;
CodePudding user response:
Simplified with :
DECLARE @SQL NVARCHAR(max) = '';
SELECT @SQL = @SQL
'SELECT ''' TABLE_SCHEMA ''' AS SCHEMA_NAME, '''
TABLE_NAME ''' AS TABLE_NAME, '''
COLUMN_NAME ''' AS COL_NAME, '''
DATA_TYPE ''' AS DATA_TYPE, CAST(['
COLUMN_NAME '] AS NVARCHAR(4000)) AS [VALUE], COUNT(*) AS COUNT_VALUE FROM ['
TABLE_SCHEMA '].[' TABLE_NAME
'] GROUP BY [' COLUMN_NAME '] UNION ALL '
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE NOT IN ('text', 'ntext', 'image')
AND CHARACTER_MAXIMUM_LENGTH >= 0;
SET @SQL = LEFT(@SQL, LEN(@SQL) - 10);
EXEC (@SQL);
CodePudding user response:
This code by RoughPlace does exactly what I was looking for!
create Table #Temp
(tableName varchar(100),
columnName varchar(100),
value varchar(1000),
distinctItems int)
Declare @tabName as varchar(100)
Declare @colName as varchar(100)
Declare @tabid as int
Declare cursorTables Cursor
for
select t.object_id , t.name , c.name from sys.tables t inner join sys.columns c on t.object_id = c.object_id
open cursorTables
Fetch Next from cursorTables into
@tabid,@tabName,@colName
while @@Fetch_Status = 0
Begin
declare @query as nVarchar(1000)
set @query = 'Insert into #Temp SELECT ''' @tabName ''' , ''' @colName ''', ' @colName ', COUNT([' @colName ']) AS Expr1 FROM [' @tabName '] group by [' @colName ']'
print @query
exec sp_executesql @query
Fetch Next from cursorTables into
@tabid,@tabName,@colName
End
Close cursorTables
Deallocate cursorTables
select * from #temp
drop table #temp