Randomly selecting rows and updating another table per row in Postgres-CodePudding

I'm using postgres 10 and I'm looking to randomise some data.

I start by creating a temporary table and fill it with 1,000 rows of random data.

I then want to merge that into another table that may have less or more rows than the random data.

For each row in my dimension table I want to select a random row from the random data in the temporary table, setting the values in the dimension table to the randomly selected rows values in the temporary table.

eg.

I have a table called reference.tv_shows with the fields Name and Category.

I have a temporary table called random_tv_shows with the fields Name and Category. This data is completely random and consists of 1,000 rows.

I want to go through EACH row in the reference.tv_shows and pick a random row in the random_tv_shows table and set the reference.tv_shows Name and Category to be that of the selected row in random_tv_shows.

I tried running a fairly simple select but it looks as though it evaluates itself once then updates (Or maybe RANDOM() is only random once per TX?).

UPDATE reference.tv_shows SET "Name" = (SELECT "Name" FROM random_tv_shows ORDER BY RANDOM() LIMIT 1)

Is there a way to do this in postgres?

CodePudding user response：

When I have a test table, with the field a which is an integer,

If I do this:

update test set a=random()*1000;

If wil get random values for every record in my table.

But when I do this:

update test set a=(select random()*1000);

All values for a will be the same.

This is shown in this DBFIDDLE

Because, when updating the table reference.tv_shows, you only want 1 tv_show to be updated, you need to have a unique identifier for every tv_show. currently that info is not available in the question.

EDIT: I tried to reproduce your data (less records, and lack of imagination on categories, but...