Home > Net >  Big Query -- Reorder elements within a delimited string by another delimiter
Big Query -- Reorder elements within a delimited string by another delimiter

Time:07-19

Summary

I'd like to reorder elements in a string, the elements are delimited by new lines.

The elements I'd like to sort should be ordered by a string that can have numbers or letters within it. This sorting string is not at the beginning of the data, but rather it is also a delimited string (messy data set, I know). To make this even messier, there is an extra new line; this doesn't seem like the crux of this issue

Example

Below is a simplified version of what I'd like to do. I have a table, and I'd like to sort students' favorite shows and characters by the show's name, which is the second element of a pipe-delimited string.

student favorite characters and shows
alice 10th doctor | dr who
troy | community
bob 11 | stranger things
Liz | 30 Rock
mr peanut butter | bojack horseman

would become this:

student favorite characters and shows
alice troy | community
10th doctor | dr who
bob Liz | 30 Rock
mr peanut butter | bojack horseman
11 | stranger things

What I've tried

Big Query doesn't allow arrays of arrays. If it did, I would have an easier time here. I've tried working with COLLATE but today is my first time seeing that function; I'm not sure that is the right way to go, anyways.

Currently, I'm working to split by new line, and rejoin later. I have never done this with tables, so I'm a bit out of my element. Here is the query I'm working from:

WITH
  -- example data from above
  example_data AS (
  SELECT
    'alice' AS student,
    -- note: the new line is at the end of every pipe-delimited line, so there is always some floating empty row when using functions like split()
    '10th doctor | dr who\ntroy | community\n' AS favorite_characters_and_shows
  UNION ALL
  SELECT
    'bob' AS student,
    "11 | stranger things\nLiz | 30 Rock\nmr peanut butter | bojack horseman\n" AS favorite_characters_and_shows ),
  -- I have no need for this to be another table, but it is where I am. Tell me if this is misguided, please.
  soln_table AS (
  SELECT
    example_data.student,
    example_data.favorite_characters_and_shows,
    SPLIT(example_data.favorite_characters_and_shows, '\n'),
    array( select x from unnest(SPLIT(example_data.favorite_characters_and_shows, '\n') ) as x order by x) as foo,
  FROM
    example_data )
  -- where I am trying to display a sorted solution
SELECT
  *
FROM
  soln_table;

CodePudding user response:

Consider below approach

select student, (
    select string_agg(line, '\n' order by split(line, '|')[safe_offset(1)])
    from unnest(split(favorite_characters_and_shows, '\n')) line
    where trim(line) != ''
  ) as favorite_characters_and_shows
from example_data       

if applied to sample data in your question - output is

enter image description here

  • Related