Via SQL, I'm trying to get from this:
id | group_id | session_id | field_label | field_value | sent_at |
---|---|---|---|---|---|
1 | frosted flakes | blue bowl | first_name | Bob | 2022-11-05 18:18:19.093 |
2 | frosted flakes | blue bowl | first_name | Bobby | 2022-11-05 18:17:31.274 |
3 | frosted flakes | blue bowl | last_name | Brown | 2022-11-05 18:17:16.241 |
4 | frosted flakes | blue bowl | last_name | Browning | 2022-11-05 18:15:34.492 |
5 | frosted flakes | blue bowl | last_name | Brownson | 2022-11-05 18:14:58.465 |
6 | cheerios | green cup | first_name | Christine | 2022-11-05 18:18:58.222 |
7 | cheerios | green cup | last_name | Christmas | 2022-11-05 18:20:41.212 |
8 | cheerios | green cup | last_name | Christopherson | 2022-11-05 18:24:58.222 |
where
id
is uniquegroup_id
is not uniquesession_id
is not unique
to this:
group_id | session_id | amalgamated_field |
---|---|---|
frosted flakes | blue bowl | Bob Brown |
cheerios | green cup | Christine Christopherson |
Where I know the field_label
s that I want to amalgamate, and I want to get the latest value for each amalgamated field label based on sent_at
grouped by group_id
.
So for group frosted flakes, I want to get the most recent field_value
associated with field_label
first_name (Bob) and the most recent field_value
associated with field_label
last_name (Brown).
And repeat.
I tried a cross join and I also tried an inner join similar to this thread. But I keep getting all combinations :/
CodePudding user response:
You may use ROW_NUMBER
function with conditional aggregation as the following:
SELECT group_id, session_id,
CONCAT_WS(' ',
MAX(CASE WHEN field_label='first_name' AND rn=1 THEN field_value END),
MAX(CASE WHEN field_label='last_name' AND rn=1 THEN field_value END)
) AS amalgamated_field
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY group_id, session_id, field_label ORDER BY sent_at DESC) rn
FROM table_name
) T
GROUP BY group_id, session_id
ORDER BY group_id
See a demo.
CodePudding user response:
In Postgres I would recommend distinct on
and string aggregation:
select group_id, session_id,
string_agg(field_value, ' ' order by field_label) full_name
from (
select distinct on (group_id, session_id, field_label) t.*
from mytable t
order by group_id, session_id, field_label, sent_at desc
) t
where field_label in ('first_name', 'last_name')
group by group_id, session_id
The distinct on
subquery returns the latest row for each group/session/label tuple. In the outer query, we filter on the two labels we are interested in, and aggregate the field values for each group/session tuple. The order by
clause of string_agg
ensures that the first and last name are concatenated in sequence.
CodePudding user response:
You can first find the maximum timestamp for each field_label
, and then perform a self-join
of the original table back onto the maximum timestamp, aggregating the field_value
s for each group_id
:
select t1.group_id, t2.session_id, string_agg(t2.field_value,' ')
from (select t.group_id, t.field_label, max(t.sent_at) d
from tbl t group by t.group_id, t.field_label) t1
join tbl t2 on t1.d = t2.sent_at group by t1.group_id, t2.session_id