Summary
I'd like to reorder elements in a string, the elements are delimited by new lines.
The elements I'd like to sort should be ordered by a string that can have numbers or letters within it. This sorting string is not at the beginning of the data, but rather it is also a delimited string (messy data set, I know). To make this even messier, there is an extra new line; this doesn't seem like the crux of this issue
Example
Below is a simplified version of what I'd like to do. I have a table, and I'd like to sort students' favorite shows and characters by the show's name, which is the second element of a pipe-delimited string.
student | favorite characters and shows |
---|---|
alice | 10th doctor | dr who troy | community |
bob | 11 | stranger things Liz | 30 Rock mr peanut butter | bojack horseman |
would become this:
student | favorite characters and shows |
---|---|
alice | troy | community 10th doctor | dr who |
bob | Liz | 30 Rock mr peanut butter | bojack horseman 11 | stranger things |
What I've tried
Big Query doesn't allow arrays of arrays. If it did, I would have an easier time here. I've tried working with COLLATE
but today is my first time seeing that function; I'm not sure that is the right way to go, anyways.
Currently, I'm working to split by new line, and rejoin later. I have never done this with tables, so I'm a bit out of my element. Here is the query I'm working from:
WITH
-- example data from above
example_data AS (
SELECT
'alice' AS student,
-- note: the new line is at the end of every pipe-delimited line, so there is always some floating empty row when using functions like split()
'10th doctor | dr who\ntroy | community\n' AS favorite_characters_and_shows
UNION ALL
SELECT
'bob' AS student,
"11 | stranger things\nLiz | 30 Rock\nmr peanut butter | bojack horseman\n" AS favorite_characters_and_shows ),
-- I have no need for this to be another table, but it is where I am. Tell me if this is misguided, please.
soln_table AS (
SELECT
example_data.student,
example_data.favorite_characters_and_shows,
SPLIT(example_data.favorite_characters_and_shows, '\n'),
array( select x from unnest(SPLIT(example_data.favorite_characters_and_shows, '\n') ) as x order by x) as foo,
FROM
example_data )
-- where I am trying to display a sorted solution
SELECT
*
FROM
soln_table;
CodePudding user response:
Consider below approach
select student, (
select string_agg(line, '\n' order by split(line, '|')[safe_offset(1)])
from unnest(split(favorite_characters_and_shows, '\n')) line
where trim(line) != ''
) as favorite_characters_and_shows
from example_data
if applied to sample data in your question - output is