I'm using postgres 10 and I'm looking to randomise some data.
I start by creating a temporary table and fill it with 1,000 rows of random data.
I then want to merge that into another table that may have less or more rows than the random data.
For each row in my dimension table I want to select a random row from the random data in the temporary table, setting the values in the dimension table to the randomly selected rows values in the temporary table.
eg.
I have a table called reference.tv_shows with the fields Name and Category.
I have a temporary table called random_tv_shows with the fields Name and Category. This data is completely random and consists of 1,000 rows.
I want to go through EACH row in the reference.tv_shows and pick a random row in the random_tv_shows table and set the reference.tv_shows Name and Category to be that of the selected row in random_tv_shows.
I tried running a fairly simple select but it looks as though it evaluates itself once then updates (Or maybe RANDOM() is only random once per TX?).
UPDATE reference.tv_shows SET "Name" = (SELECT "Name" FROM random_tv_shows ORDER BY RANDOM() LIMIT 1)
Is there a way to do this in postgres?
CodePudding user response:
When I have a test
table, with the field a
which is an integer,
If I do this:
update test set a=random()*1000;
If wil get random values for every record in my table.
But when I do this:
update test set a=(select random()*1000);
All values for a
will be the same.
This is shown in this DBFIDDLE
Because, when updating the table reference.tv_shows
, you only want 1 tv_show to be updated, you need to have a unique identifier for every tv_show. currently that info is not available in the question.
EDIT: I tried to reproduce your data (less records, and lack of imagination on categories, but...