URL TABLE:
| URL PATH |
| www.abc.com/123 |
| www.abc.com/234 |
| www.abc.com/435 |
TRAFFIC TABLE:
| ID | URL PATH | COUNTRY | TIMESTAMP |
| 1 | www.abc.com/123 | US | 3/31/2022 10:00 |
| 1 | www.abc.com/gb/123 | GB | 3/31/2022 10:00 |
| 1 | www.abc.com/de/123 | DE | 3/31/2022 10:00 |
| 4 | www.abc.om/xyz | US | 3/31/2022 10:00 |
| 4 | www.abc.om/asd | DE | 3/31/2022 10:00 |
OBJECTIVE:
Hello, I have two tables. 1) URL Table contains a list of URLs specific to US Country and 2) Traffic Table containing all event timestamps for all countries on the website.
I want to first get the product ID that matches the URL PATH in the URL TABLE then get the aggregate count of that product ID. With the above example, the desired outcome would be:
| ID | COUNT |
| 1 | 3 |
ID 4 wouldn't appear because none of it's corresponding URL appears in the URL table.
www.abc.com/123 matches ID 1 in the traffic table and there are three rows associated with this ID hence a value of 3.
I've tried doing some LEFT JOINs but haven't had much luck.New to SQL. Would appreciate any help! Also using the Snowflake UI to create this query.
CodePudding user response:
SELECT ID, COUNT(ID) as COUNT FROM TRAFFICTABLE WHERE URLPATH in (SELECT URLPATH FROM URLTABLE) GROUP BY ID
can you try this?
CodePudding user response:
I assume that the URLs are matched based on the first and last part of the URLs?
Setup sample tables
-- Create sample URL_TABLE table
create or replace table url_table as
select COLUMN1::varchar as url_path
from
values ('www.abc.com/123'),
('www.abc.com/234'),
('www.abc.com/435')
;
-- Create sample TRAFFIC_TABLE table
create or replace table traffic_table as
select COLUMN1::number as id,
COLUMN2::varchar as path,
COLUMN3::varchar as country,
to_timestamp(COLUMN4::varchar, 'MM/DD/YYYY HH:MI') as timestamp
from
values ('1', 'www.abc.com/123', 'US', '3/31/2022 10:00'),
('1', 'www.abc.com/gb/123', 'GB', '3/31/2022 10:00'),
('1', 'www.abc.com/de/123', 'DE', '3/31/2022 10:00'),
('4', 'www.abc.om/xyz', 'US', '3/31/2022 10:00'),
('4', 'www.abc.om/asd', 'DE', '3/31/2022 10:00')
;
SQL Query
-- Run select:
with _traffic_table as (
select id,
split(path, '/') as arr,
arr[0] || '/' || arr[array_size(arr) - 1] new_path
from traffic_table
)
select _traffic_table.id,
count(1)
from url_table
join _traffic_table on url_table.url_path = _traffic_table.new_path
group by _traffic_table.id
Result:
-- --------
|ID|COUNT(1)|
-- --------
|1 |3 |
-- --------