I am currently working with a system with a search engine, and I'm trying to find out if the search result is sorted well so that the users don't have to scroll down to see the result they want.
We have to work with logs that writes down the action of the users such as:
- Search / Search Text
- Results they clicked
and the schema is as following:
user_id | time | action_type | details
-------- --------------------- ------------- --------------
jack | 2022-02-01 15:51:33 | search | query="text1"
sally | 2022-02-01 15:52:00 | search | query="text2"
sally | 2022-02-01 15:52:10 | search | query="text3"
jack | 2022-02-01 15:52:20 | click | target="system1"
sally | 2022-02-01 15:52:30 | click | target="system2"
mike | 2022-02-01 15:53:22 | search | query="text4"
...
What I want to do is to find out "what did they search?" and "what did they click afterwards?", so I want to make a table like below from the table above.
user_id | search_query | click_target
-------- -------------- --------------
jack | text1 | system1
sally | text2 | (null)
sally | text3 | system2
mike | text4 | (null)
How can I accomplish this? My guess is that I have to split the table into 'search' part and 'click' part, then match the search row with the closest click row but before the next search row. However I cannot come up with an query to take account of user_id.
CodePudding user response:
All you need is just lead()over()
analytic function:
Full example on DBFiddle
select
v.user_id
,regexp_substr(details,'"([^"] )"',1,1,null,1) as search_query
,regexp_substr(click ,'"([^"] )"',1,1,null,1) as click_target
from (
select
t.*
,lead(case when action_type='click' then details end)
over(partition by user_id order by time) as click
from your_table t
) v
where action_type='search';
Results:
USER_ID SEARCH_QUERY CLICK_TARGET
---------- ------------ ------------
jack text1 system1
sally text2 (null)
sally text3 system2
mike text4 (null)
Full example with test data:
with your_table(user_id, time, action_type, details) as (
select 'jack' , to_date('2022-02-01 15:51:33','yyyy-mm-dd hh24:mi:ss'), 'search', 'query="text1"' from dual union all
select 'sally', to_date('2022-02-01 15:52:00','yyyy-mm-dd hh24:mi:ss'), 'search', 'query="text2"' from dual union all
select 'sally', to_date('2022-02-01 15:52:10','yyyy-mm-dd hh24:mi:ss'), 'search', 'query="text3"' from dual union all
select 'jack' , to_date('2022-02-01 15:52:20','yyyy-mm-dd hh24:mi:ss'), 'click' , 'target="system1"' from dual union all
select 'sally', to_date('2022-02-01 15:52:30','yyyy-mm-dd hh24:mi:ss'), 'click' , 'target="system2"' from dual union all
select 'mike' , to_date('2022-02-01 15:53:22','yyyy-mm-dd hh24:mi:ss'), 'search', 'query="text4"' from dual
)
select
v.user_id
,regexp_substr(details,'"([^"] )"',1,1,null,1) as search_query
,regexp_substr(click ,'"([^"] )"',1,1,null,1) as click_target
from (
select
t.*
,lead(case when action_type='click' then details end)
over(partition by user_id order by time) as click
from your_table t
) v
where action_type='search'
order by time;