Home > Blockchain >  Full table scan despite LIMIT clause
Full table scan despite LIMIT clause

Time:12-22

I'm having troubles running this simple query SELECT * FROM my_view LIMIT 10 in Amazon Redshift. View defined like this

CREATE VIEW my_view AS
SELECT * FROM my_spectrum_table
UNION ALL
SELECT * FROM ordinary_table
UNION ALL
SELECT * FROM some_materialized_view;

The problem is that query tries to scan several terabytes of data in the underlying my_spectrum_table and eventually dies from disk full error. When filtering data using WHERE clause query works just fine.

CodePudding user response:

LIMIT reduces the size of the resulting data. It CAN cause steps to be aborted in the query IF the resulting data size is met but if there are enough steps in the query this can only happen AFTER all the input data is scanned. LIMIT is generally used to reduce the amount of data sent back to the requestor.

A WHERE clause reduces the source data that is used and if the block metadata and/or the Spectrum partitioning allows CAN also reduce the amount that is scanned. If you want to reduce scan, WHERE is the way to go.

CodePudding user response:

That's how view works: it gets re-calculated every time you use it. If you want only 10 records create a query that reflects it, and use that query.

Another option is to create a materialized view, but then you should refresh it once in a while (otherwise you'll get stale data).

  • Related