Home > Software design >  Shuffle a specific column in a table on BigQuery
Shuffle a specific column in a table on BigQuery

Time:06-01

I have a table that looks like this:

    id label
     1   A
     2   A
     3   A
     4   B
     5   C
     6   C
     7   A
     8   A
     9   C
     10  B

I want to get another column label_shuffled that is the existing column label but shuffled. I need it to be efficient and fast.

Desired output:

    id label  label_shuffled
     1   A         A
     2   A         B
     3   A         C
     4   B         A
     5   C         C
     6   C         A
     7   A         C
     8   A         A
     9   C         B
     10  B         A

Any suggestions?

CodePudding user response:

An option is use window function ROW_NUMBER to enumerate the rows randomly and then join:

WITH suffle AS (
  SELECT
    id,
    label,
    ROW_NUMBER() OVER () row_number,
    ROW_NUMBER() OVER (ORDER BY RAND()) row_number_suffled
  FROM labels
)
SELECT 
  l.id,
  l.label,
  s.label as label_suffled
FROM suffle l 
JOIN suffle s on l.row_number = s.row_number_suffled
  • Related