Speed up hashing of each element in an Iterator

Time:03-17

I have an Iterator[String] with 2,501,235 elements. I also have a list of 100 hash functions, and I want to hash each element of the iterator with every hash function. The code below is what I have so far, but it takes a very long time to run. Is there a way to make it faster?

def hashing_item(value: (Int, List[List[Int]], List[Hash_Function]), item: String): (Int, List[List[Int]], List[Hash_Function]) = {
  val (bits, res, elems) = value
  // Append this item's list of hashes to the results accumulated so far
  val hashed_input = res ++ List(elems.map(func => func.apply(item) % bits))
  (bits, hashed_input, elems)
}

val tempList: List[List[Int]] = List()
val hashing_elems = s.foldLeft((bits, tempList, hashes))(hashing_item)

CodePudding user response:

If I understand the code correctly, you only need a flatMap instead of that foldLeft:

iterator.flatMap { item =>
  hashes.iterator.map { func =>
    func.apply(item) % bits
  }
}

This returns another Iterator, which is lazy: nothing is computed until its elements are consumed.
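
One thing to keep in mind is that flatMap produces a single flat Iterator[Int], while the fold in the question builds one list of hashes per item. If that grouping matters, a map over the iterator keeps it. It also avoids appending to the end of an immutable List on every step, which the fold in the question appears to do and which copies the accumulated list each time. The following is only a minimal sketch: Hash_Function is not shown in the question, so HashFunction here is an assumed String => Int alias, and hashAll, the two toy hash functions, and bits = 16 are all made up for illustration.

```scala
object HashingSketch {
  // Assumption: the question's Hash_Function behaves like String => Int.
  type HashFunction = String => Int

  // Toy hash functions for illustration only (hypothetical, not from the question).
  val hashes: List[HashFunction] = List(
    s => s.length,
    s => s.hashCode
  )

  // Hypothetical bucket count; the question does not say what bits is.
  val bits = 16

  // Returns one Vector of hashes per input item.
  // The result is still a lazy Iterator: nothing is hashed until it is consumed.
  def hashAll(items: Iterator[String]): Iterator[Vector[Int]] = {
    val fs = hashes.toVector // converted once, outside the per-item loop
    items.map(item => fs.map(f => f(item) % bits)) // same % bits as in the question
  }

  def main(args: Array[String]): Unit = {
    // Each printed line holds the hashes of one input string.
    hashAll(Iterator("a", "bb", "ccc")).foreach(row => println(row.mkString(",")))
  }
}
```

Calling hashAll(s).toVector would materialize everything in one pass if the full result is needed in memory. Consuming the iterator row by row avoids holding all 2,501,235 × 100 values at once.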
