Home > Software design >  Find the indices at which two arrays intersect
Find the indices at which two arrays intersect

Time:04-20

I'm wondering if in Scala there's a decent way to get the indices at which two arrays intersect.

So given arrays:

  • a1 = [0, 5, 10, 15, 20, 25, 30]
  • a2 = [10, 20, 30, 40, 50]

Ideally, taking advantage of the fact that both arrays are ordered and contain no duplicates.

These share common elements a1.intersect(a2) = [10, 20, and 30]. The index (position) at which these elements occur is different for each array.

I would like to produce a sequence of tuples with the positions from each list where they intersect:

  • intersectingIndices(a1, a2) = [(2, 0), (4, 1), (6, 2)]

While intersect gives the intersecting values, I need to know the original indices and would prefer not to have to do an O(N) scan to find each one - as these arrays get very long (millions of elements). I also suspect the complexity of intersect is unnecessarily large given both arrays will always be sorted in advance, so a single-pass option would be preferable.

CodePudding user response:

If both lists are storted, it seems fairly straightforward, just a slight variation of the "merge" phase of merge-sort.

@taiilrec
def intersect(
   left: List[Int], 
   right: List[Int], 
   lidx: Int = 0, 
   ridx: Int = 0,
   result: List[(Int, Int)] = Nil
): List[(Int, Int)] = (left, right) match { 
    case (Nil, _) | (_, Nil) => result.reverse
    case (l::tail, r::_) if l < r => intersect(tail, right, lidx 1, ridx, result)
    case (l::_, r::tail) if l > r => intersect(left, tail, lidx, ridx 1,  result)
    case (l::ltail, r::rtail) => intersect(ltail, rtail, lidx 1, ridx 1, (lidx, ridx) :: result)
}
   

Or just hash one of the lists, and then scan the other (it is still O(n), albeit somewhat more expensive, but much simpler):

val hashed = left.zipWithIndex.toMap
right.zipWithIndex.flatMap { case(x, idx) => hashed.get(x).map(idx -> _) }
  • Related