Suppose I have the below dataset. I would like to filter out the session_id
(along with user_id
) which has product
event_type but not the purchase
event_type. I feel I know the answer but just can't put a finger on it. Your help would be really appreciated. Thanks in advance.
user_id session_id event_type
59378 00000e74-6a4d-495d-aab5-2ac51165b34a cart
59378 00000e74-6a4d-495d-aab5-2ac51165b34a department
59378 00000e74-6a4d-495d-aab5-2ac51165b34a purchase
59378 00000e74-6a4d-495d-aab5-2ac51165b34a product
43155 00001e3b-f089-4907-8f8a-09951de4de27 product
43155 00001e3b-f089-4907-8f8a-09951de4de27 cart
43155 00001e3b-f089-4907-8f8a-09951de4de27 department
43155 00001e3b-f089-4907-8f8a-09951de4de27 home
66131 0000bc7a-b34f-484f-a534-e9ffaacbb0c7 department
66131 0000bc7a-b34f-484f-a534-e9ffaacbb0c7 cart
66131 0000bc7a-b34f-484f-a534-e9ffaacbb0c7 product
66131 0000bc7a-b34f-484f-a534-e9ffaacbb0c7 home
66131 0000bc7a-b34f-484f-a534-e9ffaacbb0c7 purchase
83743 0001f1a2-d817-4639-8f4b-1949a7482af4 department
83743 0001f1a2-d817-4639-8f4b-1949a7482af4 product
83743 0001f1a2-d817-4639-8f4b-1949a7482af4 cart
3493 0002d27b-eebb-4715-aac6-9d9f938e52a8 cart
3493 0002d27b-eebb-4715-aac6-9d9f938e52a8 department
3493 0002d27b-eebb-4715-aac6-9d9f938e52a8 product
3493 0002d27b-eebb-4715-aac6-9d9f938e52a8 purchase
7068 00033390-0e62-4540-b2d2-8fc2085f74ef department
7068 00033390-0e62-4540-b2d2-8fc2085f74ef product
7068 00033390-0e62-4540-b2d2-8fc2085f74ef cart
21834 00035b69-ed84-499e-bc0f-89ecfea2a731 product
21834 00035b69-ed84-499e-bc0f-89ecfea2a731 cart
21834 00035b69-ed84-499e-bc0f-89ecfea2a731 department
Desired result:
user_id session_id
43155 00001e3b-f089-4907-8f8a-09951de4de27
83743 0001f1a2-d817-4639-8f4b-1949a7482af4
7068 00033390-0e62-4540-b2d2-8fc2085f74ef
21834 00035b69-ed84-499e-bc0f-89ecfea2a731
CodePudding user response:
Try the following:
Select A.user_id, A.session_id
From Dataset A
Where A.event_type='product'
And Not Exists (Select 1 From Dataset where user_id=A.user_id And event_type='purchase')
CodePudding user response:
Use aggregation and the conditions in the HAVING
clause:
SELECT user_id, session_id
FROM tablename
WHERE event_type IN ('product', 'purchase')
GROUP BY user_id, session_id
HAVING COUNTIF(event_type = 'purchase') = 0;
Or:
SELECT user_id, session_id
FROM tablename
GROUP BY user_id, session_id
HAVING COUNTIF(event_type = 'product') > 0 AND COUNTIF(event_type = 'purchase') = 0;