I have a SQL table:
id | phoneNo | transactionDateTime | transactionDate |
---|---|---|---|
1 | 123456 | 2022-03-15 11:46:51:00 | 2022-03-15 |
2 | 123456 | 2022-03-15 11:49:52:00 | 2022-03-15 |
3 | 123456 | 2022-03-15 11:52:52:00 | 2022-03-15 |
4 | 123456 | 2022-03-15 12:53:00:00 | 2022-03-15 |
5 | 123456 | 2022-03-15 11:49:52:00 | 2022-03-15 |
6 | 234567 | 2022-03-15 11:49:52:00 | 2022-03-15 |
7 | 546345 | 2022-03-15 11:49:52:00 | 2022-03-15 |
8 | 874655 | 2022-03-15 11:49:52:00 | 2022-03-15 |
I need to get all the entries where more than 4 transactions have happened for the same phoneNo
and the transactionDateTime
's are within a 2 hour span, so in the case above, id 1 to 5 have the same phoneNo and are within the 2 hour span, so the result would be
id | phoneNo | transactionDateTime | transactionDate |
---|---|---|---|
1 | 123456 | 2022-03-15 11:46:51:00 | 2022-03-15 |
2 | 123456 | 2022-03-15 11:49:52:00 | 2022-03-15 |
3 | 123456 | 2022-03-15 11:52:52:00 | 2022-03-15 |
4 | 123456 | 2022-03-15 12:53:00:00 | 2022-03-15 |
5 | 123456 | 2022-03-15 11:49:52:00 | 2022-03-15 |
So far, I have been able to query :
SELECT * FROM txnTable
WHERE phoneNo IN
(SELECT phoneNo
FROM txnTable
GROUP BY phoneNo, transactionDate
HAVING COUNT(phoneNo) > 4 )
which only groups all entries by date and not the 2 hour timespan. Would appreciate any help on how to write a query in this particular scenario.
CodePudding user response:
You can use SQL Server floor
function to group your transactionDateTime
every 2 hours.
SELECT * FROM txnTable
WHERE phoneNo IN
(SELECT phoneNo
FROM txnTable
GROUP BY phoneNo, transactionDate, floor(datepart(hh, transactionDateTime)/2
HAVING COUNT(phoneNo) > 4 and floor(datepart(hh, transactionDateTime)/2) = 2)
CodePudding user response:
Choose every row from the table, as long as the number of records with the same phoneno and within 2 hours each way is 4
SELECT *
FROM txnTable t1
WHERE
4= (SELECT count(*)
FROM txnTable t2
WHERE t2.phoneNo=t1.phoneNo
AND abs(datediff(second,
t1.transactionDateTime,
t2.transactionDateTime
)
) <=2*3600
)
ORDER BY phoneNo, transactionDateTime
CodePudding user response:
You can calculate the hour difference between 2 rows which is inside a partitioned bucket and arrive at the max of differences within that bucket. Once that is deduced, it can be filtered using the "2 hours" criteria as below.
declare @tbl table(id int identity(1,1)
,phoneNo int,transactionDateTime datetime,transactionDate date
)
insert into @tbl(phoneNo,transactionDateTime,transactionDate)
values
('123456','2022-03-15 11:46:51:00','2022-03-15')
,('123456','2022-03-15 11:49:52:00','2022-03-15')
,('123456','2022-03-15 11:52:52:00','2022-03-15')
,('123456','2022-03-15 12:53:00:00','2022-03-15')
,('123456','2022-03-15 11:49:52:00','2022-03-15')
,('234567','2022-03-15 11:49:52:00','2022-03-15')
,('546345','2022-03-15 11:49:52:00','2022-03-15')
,('874655','2022-03-15 11:49:52:00','2022-03-15')
select id,phoneNo,transactionDateTime,transactionDate
from(
select *,max(hourdiff)over(partition by phoneNo order by transactionDateTime
rows between unbounded preceding and unbounded following
) as [maxhours] from(
select *
,DATEDIFF(hour, transactionDateTime , max(transactionDateTime)over(partition by phoneNo
order by transactionDateTime rows between current row and 1 following
)) AS hourdiff
,ROW_NUMBER()over(partition by phoneNo order by phoneNo) rownum
,count(1)over(partition by phoneNo order by phoneNo) [count]
from
@tbl
)t)t
where t.[count] > 4 and
t.maxhours <= 2
and t.maxhours <>0
order by id