I would like to add a where statement to only return records where a transactions_id's in transaction_id_2022
are not in transaction_id_2021
column.
Sample table:
customer_id transaction_id_2021 transaction_id_2022
382 ["81-a-39-b","22-t-42-f"] ["81-a-39-b","22-t-42-f","93-g-64-t"]
742 ["53-y-11-g","43-t-55-a"] ["53-y-11-g","43-t-55-a"]
Desired output table
customer_id transaction_id_2021 transaction_id_2022
382 ["81-a-39-b","22-t-42-f","29-f-31-i"] ["81-a-39-b","22-t-42-f","93-g-64-t"]
CodePudding user response:
the format should look like this:
SELECT col1
FROM your_table
WHERE col1 NOT IN (SELECT col2 FROM your_table)
CodePudding user response:
Combination of array_except
and cardinality
should do the trick:
array_except(x, y)
→array
Returns an array of elements inx
but not iny
, without duplicates.
cardinality(x)
→bigint
Returns the cardinality (size) of the arrayx
.
-- sample data
WITH dataset (customer_id, transaction_id_2021, transaction_id_2022) AS (
VALUES (382, array['81-a-39-b','22-t-42-f'], array['81-a-39-b','22-t-42-f','93-g-64-t']),
(742, array['53-y-11-g','43-t-55-a'], array['53-y-11-g','43-t-55-a'])
)
-- query
select *
from dataset
where cardinality(array_except(transaction_id_2022, transaction_id_2021)) > 0
Output:
customer_id | transaction_id_2021 | transaction_id_2022 |
---|---|---|
382 | [81-a-39-b, 22-t-42-f] | [81-a-39-b, 22-t-42-f, 93-g-64-t] |