Home > front end >  How to get a list of tables used for each query in the query history in snowflake
How to get a list of tables used for each query in the query history in snowflake

Time:01-05

I am doing analysis on table usage. I need to use the query history table and take out all tables that were used in a query. The query itself could use 10 separate tables, or even use the same table multiple times. I need to be able to parse through the query (using a query itself, not python or any other language, just SQL) and take out a list of all tables that the query hit.

Is there any way to do this?

An example of where I am getting the histories would be this query:

select query_text
from table(information_schema.query_history())

CodePudding user response:

The ACCESS_HISTORY view will tell you what tables are used in a query:

https://docs.snowflake.com/en/sql-reference/account-usage/access_history.html

That's an enterprise level feature. You could also run EXPLAIN on any query:

SELECT *, "objects"
FROM TABLE(EXPLAIN_JSON(SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM a.b.any_table_or_view')))
WHERE "operation"='TableScan'

See more at https://stackoverflow.com/a/64343564/132438

CodePudding user response:

An alternative approach using rlike and information_schema.tables.

You could extend this further by looking at the # rows per table (high = fact, low = dimension) and the number of times accessed.

select query_text, array_agg(DISTINCT TABLE_NAME::string) 
from 
 (select top 100 query_text 
  from 
     table(information_schema.query_history())
  where 
     EXECUTION_STATUS = 'SUCCESS' ) a
 left outer join 
 (select TABLE_NAME from INFORMATION_SCHEMA.TABLES group by TABLE_NAME) b
 on 
    upper(a.query_text) rlike '.*('||upper(b.table_name)||').*'
group by 
   query_text
  •  Tags:  
  • Related