I am having issues figuring out wow I can transform my data from the example table to the desired results? The main idea is to group the rows by id_1 and id_2 and then transform the data into one row with the order by sequence_id. Any help or tips would be appreciated, thanks!
Example data:
date | id_1 | id_2 | sequence_id | data_1 | data_2 | data_3 |
---|---|---|---|---|---|---|
2020-01-01 | ABC | 123 | 2 | hi | nice | to |
2020-01-01 | ABC | 123 | 3 | meet | you | my |
2020-01-01 | ABC | 123 | 4 | name | is | bob |
2020-02-01 | DEF | 456 | 1 | good | day | sir |
2020-02-01 | DEF | 456 | 3 | how | are | you |
Desired output:
date | id_1 | id_2 | sequence_id | data_1 | data_2 | data_3 | data_1 | data_2 | data_3 | data_1 | data_2 | data_3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2020-01-01 | ABC | 123 | 2 | hi | nice | to | meet | you | my | name | is | bob |
2020-02-01 | DEF | 456 | 1 | good | day | sir | how | are | you |
CodePudding user response:
We cannot use the same name for more than one column in a table.Hence your desired output is not feasible(i.e data_1/2/3 as column name more than once).
If your goal is to have the full sentence for each id in each row then as an alternative you can consider the below query:
with cte as (
select "2020-01-01" date,"ABC" id_1,"123"id_2,"2"sequence,"hi" data_1,"nice" data_2,"to" data_3 union all
select "2020-01-01","ABC","123","3","meet","you","my" union all
select "2020-01-01","ABC","123","4","name","is","bob" union all
select "2020-02-01","DEF","456","1","good","day","sir" union all
select "2020-02-01","DEF","456","3","how","are","you"
)
select date,id_1,id_2,sequence
,STRING_AGG(concat(data_1," ",data_2," ",data_3)," ")over(partition by date,id_1,id_2 order by sequence)str
from cte
qualify row_number() over(partition by date,id_1,id_2 order by sequence desc)=1
CodePudding user response:
Consider below approach (could be good starting point for you to further optimize it)
select * from(
select date, id_1, id_2, min(sequence_id) over win as sequence_id,
data, row_number() over win pos
from your_table, unnest([data_1, data_2, data_3]) data with offset
window win as (partition by date, id_1, id_2 order by sequence_id, offset)
)
pivot (any_value(data) as data_ for pos in (1,2,3,4,5,6,7,8,9))
if applied to sample data in your question - output is