How to transpose data grouped by two fields-CodePudding

I am having issues figuring out wow I can transform my data from the example table to the desired results? The main idea is to group the rows by id_1 and id_2 and then transform the data into one row with the order by sequence_id. Any help or tips would be appreciated, thanks!

Example data:

date	id_1	id_2	sequence_id	data_1	data_2	data_3
2020-01-01	ABC	123	2	hi	nice	to
2020-01-01	ABC	123	3	meet	you	my
2020-01-01	ABC	123	4	name	is	bob
2020-02-01	DEF	456	1	good	day	sir
2020-02-01	DEF	456	3	how	are	you

Desired output:

date	id_1	id_2	sequence_id	data_1	data_2	data_3	data_1	data_2	data_3	data_1	data_2	data_3
2020-01-01	ABC	123	2	hi	nice	to	meet	you	my	name	is	bob
2020-02-01	DEF	456	1	good	day	sir	how	are	you

CodePudding user response：

We cannot use the same name for more than one column in a table.Hence your desired output is not feasible(i.e data_1/2/3 as column name more than once).

If your goal is to have the full sentence for each id in each row then as an alternative you can consider the below query:

with cte as (
 select "2020-01-01" date,"ABC" id_1,"123"id_2,"2"sequence,"hi" data_1,"nice" data_2,"to" data_3 union all
select "2020-01-01","ABC","123","3","meet","you","my" union all
select "2020-01-01","ABC","123","4","name","is","bob" union all
select "2020-02-01","DEF","456","1","good","day","sir" union all
select "2020-02-01","DEF","456","3","how","are","you"
)
 
 
select date,id_1,id_2,sequence
 ,STRING_AGG(concat(data_1," ",data_2," ",data_3)," ")over(partition by date,id_1,id_2 order by sequence)str
from cte
qualify row_number() over(partition by date,id_1,id_2 order by sequence desc)=1

CodePudding user response：

Consider below approach (could be good starting point for you to further optimize it)

select * from(
  select date, id_1, id_2, min(sequence_id) over win as sequence_id, 
    data, row_number() over win pos
  from your_table, unnest([data_1, data_2, data_3]) data with offset
  window win as (partition by date, id_1, id_2 order by sequence_id, offset)
)
pivot (any_value(data) as data_ for pos in (1,2,3,4,5,6,7,8,9))

if applied to sample data in your question - output is