Write a query that returns all urls involved in a "bounce." A url X is involved in a bounce if a single user navigates from a page Y to page X, then immediately returns to page Y while visiting no other pages in between. That is, you'll want to find two visits X->Y (on date d1) and Y->X (on date d2) for the same user u, for which there are no other visits by u between d1 and d2.
You can compare dates just like numbers: d1 < d2 is true if date d1 is prior to date d2.
I self joined this table:
url | dt | uid | src | rev |
---|---|---|---|---|
A04 | 1890-05-14 | A | A01 | 10 |
A02 | 1890-05-15 | A | A04 | 15 |
A01 | 1890-05-16 | A | A04 | 20 |
This is my query:
Select *
FROM Visit V1, Visit V2
WHERE V1.url = V2.src
AND V1.src = V2.url
AND V1.dt < V2.dt
AND V1.uid = V2.uid
I understand that the single user and navigation part, and comparing the dates part.
A01 would not work here because the single user A nagivated from A01 -> A04 -> A01, however, the user visited A02 in between so it would return an empty set. How would I make it so that it return the urls only when the single user visits no other pages while navigating back and forth?
CodePudding user response:
You can add another condition in your where clause as below:
AND NOT EXISTS (SELECT * FROM Visit visit WHERE visit.uid=V1.uid and visit.dt>V1.dt AND visit.dt<V2.dt)