I want to filter a map of type Map[String, Seq[Int]] by its values (the Seqs). As an example:
val data: Map[String, Seq[Int]] = Map("a" -> Seq(1,2,3,4,5,6), "b" -> Seq(2,3,4,5,6,7), "c" -> Seq(3,4,5,6,7,8), "d"->Seq(1))
For this data, I want to extract specific values from each Seq when they are all present, and drop the entry from the map when they are not. For example, if I am looking for the subarray (4,5) in the values of the map, the output should be
Map("a" -> Seq(4,5), "b" -> Seq(4,5), "c" -> Seq(4,5))
I tried to use the following lines of code, but they don't work as expected:
val filtered_data_1 = data.filter( { case (_, value) => (value.filter(v => v.equals((4,5)))).length > 0 })
val filtered_data_2 = data.filter(d => (d._2.filter(v => v.equals((4,5)))).length > 0)
Please let me know what is the issue with my code and how to correct it. Thank you!
CodePudding user response:
You were close. This can probably be condensed into a single function call, but it gives you the idea:
scala> val data: Map[String, Seq[Int]] = Map("a" -> Seq(1,2,3,4,5,6), "b" -> Seq(2,3,4,5,6,7), "c" -> Seq(3,4,5,6,7,8), "d"->Seq(1))
data: Map[String,Seq[Int]] = Map(a -> List(1, 2, 3, 4, 5, 6), b -> List(2, 3, 4, 5, 6, 7), c -> List(3, 4, 5, 6, 7, 8), d -> List(1))
scala> val subSeq = Seq(4,5)
subSeq: Seq[Int] = List(4, 5)
scala> val filtered_data_1 = data.filter { case (_, value) => subSeq.forall(value.contains) }.map { case (k, _) => (k, subSeq) }
filtered_data_1: scala.collection.immutable.Map[String,Seq[Int]] = Map(a -> List(4, 5), b -> List(4, 5), c -> List(4, 5))
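As for why the original attempts return nothing: in value.filter(v => v.equals((4,5))), each v is a single Int, while (4,5) is a tuple, so the comparison is never true and every entry is filtered out. The sketch below, written as plain REPL/script statements, shows that next to the forall/contains check used above. It also shows containsSlice, which is only a suggestion on my part in case "subarray" means that 4 and 5 must appear next to each other in that order; the name scrambled is made up for illustration.

```scala
val data: Map[String, Seq[Int]] = Map(
  "a" -> Seq(1, 2, 3, 4, 5, 6),
  "b" -> Seq(2, 3, 4, 5, 6, 7),
  "c" -> Seq(3, 4, 5, 6, 7, 8),
  "d" -> Seq(1)
)
val subSeq = Seq(4, 5)

// Original attempt: v is an Int and (4, 5) is a Tuple2, so equals is always false
// and no entry survives the filter.
val original = data.filter { case (_, value) => value.filter(v => v.equals((4, 5))).length > 0 }

// forall/contains: keeps entries whose Seq holds every element of subSeq, in any position.
val byElements = data.filter { case (_, value) => subSeq.forall(value.contains) }

// containsSlice: keeps entries whose Seq holds subSeq as a contiguous run.
val bySlice = data.filter { case (_, value) => value.containsSlice(subSeq) }

// Hypothetical input where the two checks disagree: 4 and 5 are present but not adjacent.
val scrambled = Seq(5, 9, 4)
val looseMatch = subSeq.forall(scrambled.contains)
val strictMatch = scrambled.containsSlice(subSeq)
```

For the data in the question both checks keep "a", "b" and "c"; they only differ on inputs like scrambled, where the elements are present but not contiguous.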
CodePudding user response:
As the other answer mentioned, there is a shorter, more convenient approach to this. Most of the time with Scala collections, when you do:
myCollection.filter(p).map(f) // chaining filter and map
it's better to use collect instead. The code will be more readable and easier to follow, and, depending on the collection implementation, it should run no slower than the filter-then-map version (at worst the two are equal):
myCollection.collect { case elem if p(elem) => f(elem) }
So in your case:
data.collect {
  case (key, value) if subSeq.forall(value.contains) =>
    key -> subSeq
}
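To check that the two styles agree, here is a small self-contained sketch using the data from the question. The names viaFilterMap and viaCollect are just illustrative; the statements are meant to be pasted into the REPL or run as a script.

```scala
val data: Map[String, Seq[Int]] = Map(
  "a" -> Seq(1, 2, 3, 4, 5, 6),
  "b" -> Seq(2, 3, 4, 5, 6, 7),
  "c" -> Seq(3, 4, 5, 6, 7, 8),
  "d" -> Seq(1)
)
val subSeq = Seq(4, 5)

// Two steps: filter first, then map the surviving entries.
val viaFilterMap = data
  .filter { case (_, value) => subSeq.forall(value.contains) }
  .map { case (key, _) => key -> subSeq }

// One step: the guard plays the role of filter, the case body the role of map.
val viaCollect = data.collect {
  case (key, value) if subSeq.forall(value.contains) => key -> subSeq
}

// Both approaches produce the same Map[String, Seq[Int]].
assert(viaFilterMap == viaCollect)
```

Because the partial function returns a key/value pair, collect on a Map gives back a Map, so no extra toMap is needed.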