Home > Software design >  Snowflake : Re-running the exact same query twice (same conditions) returns different results (diffe
Snowflake : Re-running the exact same query twice (same conditions) returns different results (diffe

Time:05-17

Snowflake : Re-running the exact same query twice (same conditions) returns different results (different number of rows) despite not using cached data (ALTER SESSION SET USE_CACHED_RESULT = FALSE;), do you please know what could be the issue ?

Thanks for your help

CodePudding user response:

If you have some form of ranking function RANK, LIMIT, ROW_NUMBER, TOP these will sort the data, and if there are rows with the "same values in the ORDER BY" clause. Then the row returned by these functions is called unstable.

Thus if you are then joining these sub-results to other tables, you might get rather large changes because the two unstable rows, might join to other rows with wildly different foreign keys relationships.

The next class of instability is numerical, floating point numbers are not "great" thus if you order by values that are VERY close, their order can swap. And if you are summing floats, and then selecting based of "being over a threshold" value, the order you "math" floats can impact the result.

Another reason is the underlying data is changing, if you are selecting from a table with background insertion occurring on it. "same results" should not be expected.

Those are the top of my head reasons, to describe what you witness.

CodePudding user response:

A query can return different results due to varying factors. A few of them are provided below.

  1. A query involves functions that must be evaluated at execution time like CURRENT_TIMESTAMP(), UDF, external functions, etc.,
  2. Any configuration options that will affect how the result was produced have changed between the runs.
  3. A table has been reclustered or consolidated due to changes to other data in the table if applicable.

Reaching out to Snowflake support can help you get more insights on this as the actual queries can be analyzed for the reason behind

  • Related