Home > database >  Determine whether the time range of one table entry overlaps with another KDB /Q
Determine whether the time range of one table entry overlaps with another KDB /Q

Time:03-01

I have a table like the following:

table:([] RIC:`A.N`A.N`A.N`GOOG.O`GOOG.O; 
startRange:2022.01.03D09:31:54.000000000 2022.01.03D09:32:04.000000000 2022.01.03D09:31:54.100000000 2022.01.03D09:31:54.000000000 2022.01.03D09:31:54.100000000;
endRange:2022.01.03D09:31:59.000000000 2022.01.03D09:32:09.000000000 2022.01.03D09:31:59.100000000 2022.01.03D09:31:59.000000000 2022.01.03D09:31:59.100000000)

I want to add a column "overlap" that is a boolean flag that is equal to 1 any time an entry has another entry (with the same corresponding RIC) that overlaps with it's time range. So, for my table above, the first and third entries should be flagged because they're both for `A.N and have overlapping datetime ranges. The fourth and fifth entry should be flagged because they also have the same RIC and overlapping datetime ranges.

I honestly have no idea at all how to even approach this. Any advice would be greatly appreciated!

CodePudding user response:

A bit of a long-winded solution here but I think this covers your use case:

q)raze{update overlap:{any(x within'z)|y within'z}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
RIC    startRange                    endRange                      overlap
--------------------------------------------------------------------------
A.N    2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000 1
A.N    2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000 0
A.N    2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000 1
GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000 1
GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000 1

To break this answer down, firstly we needed the time ranges to check for an overlap. We start with all the time ranges for a given RIC:

q)`overlap xcols raze{update overlap:count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap                                                                                                                                                                          ..
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000)                                                        ..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000)                                                        ..

We want to exclude the time range for the entry we're working from:

q)`overlap xcols raze{update overlap:{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap                                                                                                                   RIC    startRange                    endRange          ..
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------..
(2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000) A.N    2022.01.03D09:31:54.000000000 2022.01.03D09:31:5..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000) A.N    2022.01.03D09:32:04.000000000 2022.01.03D09:32:0..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000) A.N    2022.01.03D09:31:54.100000000 2022.01.03D09:31:5..
,2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000                                                              GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:5..
,2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000                                                              GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:5..

And finally see if startRange or endRange are within these time ranges:

q)`overlap xcols raze{update overlap:{any(x within'z)|y within'z}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap RIC    startRange                    endRange
--------------------------------------------------------------------------
1       A.N    2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000
0       A.N    2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000
1       A.N    2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000
1       GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000
1       GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000
  • Related