I would like to generate some UNIQUE random numbers in Snowflake with a specific starting/ending point. I would like for the numbers to start at 1,000 and end at 1,000,000.
Another requirement is joining a string at the beginning of these numbers.
So far I have been using this statement:
SELECT CONCAT('TEST-' , uniform(10000, 99000, RANDOM()));
Which works as expected and gives me the output of e.g. 'TEST-31633'.
However the problem is I am generating these for a large amount of rows, and I need for them to be completely unique.
I have heard of the 'SEQ1' functions however not sure how I could specify a starting point as well as adding a 'TEST-' with the CONCAT function at the beginning. Ideally they won't be in a strict sequence but differ from each other.
Thank You
CodePudding user response:
did you try with UNIFORM, may be this will work for you. https://docs.snowflake.com/en/sql-reference/functions/uniform.html
select CONCAT('TEST-' , uniform(10000, 99000, random())) from table(generator(rowcount => 100));
CodePudding user response:
It's not possible to guarantee uniqueness unless you use some kind of calculated part. For example:
create or replace table testX as
select CONCAT('TEST-' , to_char(seq4(),'0000'),
uniform(100000, 990000, random())) c
from table(generator(rowcount => 1000));
Otherwise, you may try to increase the range to reduce the probability:
create or replace table testX as
select CONCAT('TEST-' , uniform(1000000, 9900000, random())) c
from table(generator(rowcount => 1000));
CodePudding user response:
Any time it's required to assign unique random numbers inside a range, that's best handled through a shuffle rather than generation of new random numbers. Simply generate the sequence, and then shuffle its order.
Depending on the use case, you could simply generate the rows and use order by random()
with a limit on the number selected:
select * from (
select 'TEST-' || (seq4() 10000)::string as MY_TEST_ID
from table(generator(rowcount => 89000))
) order by random() limit 10000
;
A general-purpose solution to this problem is more complex but certainly possible. A JavaScript UDTF can populate an array with all values in the range and shuffle it. Using a window function on the JavaScript UDTF to ensure that the rows are distributed in a single block, it will allow creation of unique random integers in a range in any SQL statement.
First, create the table functions:
create or replace function UNIQUE_RANDOM_INTEGERS(LBOUND float, UBOUND float)
returns table (UNIQUE_RAND_INT float)
language javascript
strict volatile
as
$$
{
initialize: function (argumentInfo, context) {
this.lBound = argumentInfo.LBOUND.constValue;
this.uBound = argumentInfo.UBOUND.constValue;
this.rSpace = this.uBound - this.lBound 1;
if (this.lBound >= this.uBound) throw new Error(">>> LBOUND and UBOUND must be constants and UBOUND must be greater than LBOUND <<<");
if (this.rSpace > 25000000) throw new Error (">>> The difference between LBOUND and UBOUND must be 25,000,000 or less.");
this.rands = new Array(this.rSpace);
this.currentRow = 0;
for (let i = 0; i < this.rands.length; i ) {
this.rands[i] = this.lBound i;
}
this.rands = shuffle(this.rands);
function shuffle(array) {
let currentIndex = array.length, randomIndex;
while (currentIndex != 0) {
randomIndex = Math.floor(Math.random() * currentIndex);
currentIndex--;
[array[currentIndex], array[randomIndex]] = [array[randomIndex], array[currentIndex]];
}
return array;
}
},
processRow: function (row, rowWriter, context) {
//
},
finalize: function (rowWriter, context) {
for (let i = 0; i < this.rSpace; i ) {
rowWriter.writeRow({UNIQUE_RAND_INT:this.rands[i]});
}
},
}
$$;
create or replace function UNIQUE_RANDOM_INTEGERS(LBOUND int, UBOUND int)
returns table(UNIQUE_RAND_INT int)
language sql
as
$$
select UNIQUE_RAND_INT::int from table(UNIQUE_RANDOM_INTEGERS(LBOUND::float, UBOUND::float) over (partition by 1))
$$;
You can then generate unique random numbers in a range using this SQL:
select 'TEST-' || UNIQUE_RAND_INT as MY_RAND_VAL
from table(unique_random_integers(10000, 99000));