Home > Software design >  How to filter values of a column based on the values of another column considering first column is g
How to filter values of a column based on the values of another column considering first column is g

Time:07-25

Suppose I have the below dataset. I would like to filter out the session_id (along with user_id) which has product event_type but not the purchase event_type. I feel I know the answer but just can't put a finger on it. Your help would be really appreciated. Thanks in advance.

user_id session_id  event_type
59378   00000e74-6a4d-495d-aab5-2ac51165b34a    cart
59378   00000e74-6a4d-495d-aab5-2ac51165b34a    department
59378   00000e74-6a4d-495d-aab5-2ac51165b34a    purchase
59378   00000e74-6a4d-495d-aab5-2ac51165b34a    product
43155   00001e3b-f089-4907-8f8a-09951de4de27    product
43155   00001e3b-f089-4907-8f8a-09951de4de27    cart
43155   00001e3b-f089-4907-8f8a-09951de4de27    department
43155   00001e3b-f089-4907-8f8a-09951de4de27    home
66131   0000bc7a-b34f-484f-a534-e9ffaacbb0c7    department
66131   0000bc7a-b34f-484f-a534-e9ffaacbb0c7    cart
66131   0000bc7a-b34f-484f-a534-e9ffaacbb0c7    product
66131   0000bc7a-b34f-484f-a534-e9ffaacbb0c7    home
66131   0000bc7a-b34f-484f-a534-e9ffaacbb0c7    purchase
83743   0001f1a2-d817-4639-8f4b-1949a7482af4    department
83743   0001f1a2-d817-4639-8f4b-1949a7482af4    product
83743   0001f1a2-d817-4639-8f4b-1949a7482af4    cart
3493    0002d27b-eebb-4715-aac6-9d9f938e52a8    cart
3493    0002d27b-eebb-4715-aac6-9d9f938e52a8    department
3493    0002d27b-eebb-4715-aac6-9d9f938e52a8    product
3493    0002d27b-eebb-4715-aac6-9d9f938e52a8    purchase
7068    00033390-0e62-4540-b2d2-8fc2085f74ef    department
7068    00033390-0e62-4540-b2d2-8fc2085f74ef    product
7068    00033390-0e62-4540-b2d2-8fc2085f74ef    cart
21834   00035b69-ed84-499e-bc0f-89ecfea2a731    product
21834   00035b69-ed84-499e-bc0f-89ecfea2a731    cart
21834   00035b69-ed84-499e-bc0f-89ecfea2a731    department

Desired result:

user_id session_id 
43155   00001e3b-f089-4907-8f8a-09951de4de27
83743   0001f1a2-d817-4639-8f4b-1949a7482af4
7068    00033390-0e62-4540-b2d2-8fc2085f74ef
21834   00035b69-ed84-499e-bc0f-89ecfea2a731

CodePudding user response:

Try the following:

Select A.user_id, A.session_id
From Dataset A 
Where A.event_type='product'
And Not Exists (Select 1 From Dataset where user_id=A.user_id And event_type='purchase')

See a demo using MySQL from enter image description here

CodePudding user response:

Use aggregation and the conditions in the HAVING clause:

SELECT user_id, session_id
FROM tablename
WHERE event_type IN ('product', 'purchase')
GROUP BY user_id, session_id
HAVING COUNTIF(event_type = 'purchase') = 0;

Or:

SELECT user_id, session_id
FROM tablename
GROUP BY user_id, session_id
HAVING COUNTIF(event_type = 'product') > 0 AND COUNTIF(event_type = 'purchase') = 0;
  • Related