Home > Software design >  How to sample 10% of the data from a table?
How to sample 10% of the data from a table?

Time:10-17

How can I return 10% of the rows from a table in Snowflake. For example, how can I return the 10th, 20th, 30th row based after sorting a table.

CodePudding user response:

Using QUALIFY and modulo division:

SELECT *
FROM tab
QUALIFY ROW_NUMBER() OVER(ORDER BY some_column) % 10 = 0;

CodePudding user response:

I would suggest making use of modulo arithmetic and the ROW_NUMBER window function:

SELECT * FROM (
  SELECT *, ROW_NUMBER() OVER(ORDER BY CC_REC_END_DATE) as rownum
  FROM "SNOWFLAKE_SAMPLE_DATA"."TPCDS_SF100TCL"."CALL_CENTER"
) t
WHERE t.rownum % 10 = 0

CodePudding user response:

If I understand your question right, I think you're looking for SAMPLE.

From the Snowflake docs: Fraction-based Row Sampling - Return a sample of a table in which each row has a 10% probability of being included in the sample:

select * from testtable sample (10);
  • Related