Home > Enterprise >  Scala groupBy all elements in the item's list
Scala groupBy all elements in the item's list

Time:10-15

I have a list of tuples, where the first element is a string and the second is a list of strings.

For example...(ignoring speech marks)

val p = List((a, List(x,y,z)), (b, List(x)), (c, List(y,z)))

My goal is to group this list into a map with the elements of the nested lists acting as keys.

val q = Map(x -> List(a,b), y -> List(a,c), z-> List(a,c))

My initial thought was to group by the second elements of p but this assigns the entire lists to the keys.

I'm a beginner to Scala so any advice is appreciated. Should I expect to be able to complete this with higher order functions or would for loops be useful here?

Thanks in advance :)

CodePudding user response:

Here are two variants:

val p = List(("a", List("x","y","z")), ("b", List("x")), ("c", List("y","z")))

// 1. "Transducers"
p.flatMap{ case (k, v) => v.map { _ -> k }  }  // List((x,a), (y,a), (z,a), (x,b), (y,c), (z,c))
  .groupBy(_._1) // Map(z -> List((z,a), (z,c)), y -> List((y,a), (y,c)), x -> List((x,a), (x,b)))
  .mapValues(_.map(_._2)) // Map(z -> List(a, c), y -> List(a, c), x -> List(a, b))

// 2. For-loop
var res = Map[String, List[String]]()

for ( (k, vs) <- p;  v <- vs) {
  res  = v -> k :: res.getOrElse(v, List())
}

res  // Map(x -> List(b, a), y -> List(c, a), z -> List(c, a))

// Note, values of `res` are inverted, 
// because the efficient "cons" operator (::) was used to add values to the lists
// you can revert the lists afterwards as this:

res.mapValues(_.reverse) // Map(x -> List(a, b), y -> List(a, c), z -> List(a, c))

Second variant is more performant, because no intermediate collections are created, but it also could be considered "less idiomatic", as mutable variable res is used. However, it's totally fine to use mutable approach inside a private method.


UPD. Per @LuisMiguelMejíaSuárez's suggestions:

In (1), since scala 2.13, groupBy followed by mapValues can be replaced by groupMap, so the whole chain becomes:

p.flatMap{ case (k, v) => v.map { _ -> k }  }
   .groupMap(_._1)(_._2)

Another functional variant without intermediate collections can be achieved using foldLeft:

p.foldLeft(Map[String, List[String]]()) {
  case (acc, (k, vs)) =>
    vs.foldLeft(acc) { (acc1, v) =>
      acc1   (v -> (k :: acc1.getOrElse(v, List())))
    }
}

Or slightly more efficiently with updatedWith (scala 2.13):

p.foldLeft(Map[String, List[String]]()) {
  case (acc, (k, vs)) =>
    vs.foldLeft(acc) { (acc1, v) =>
      acc1.updatedWith(v) {
        case Some(list) => Some(k :: list)
        case None       => Some(List(k))
      }
    }
}

... or same thing slightly shorter:

p.foldLeft(Map[String, List[String]]()) {
  case (acc, (k, vs)) =>
    vs.foldLeft(acc) { (acc1, v) =>
      acc1.updatedWith(v)(_.map(k :: _).orElse(Some(List(k))))
    }
}

Overall, I'd suggest either using foldLeft variant (most performant and functional), or the first, groupMap variant (shorter, and arguably more readable, but less performant), depending on your goals.

CodePudding user response:

Your input list p is one step away from being a Map. From there all you need is a general purpose Map inverter.

import scala.collection.generic.IsIterableOnce
import scala.collection.Factory

// from Map[K,C[V]] to Map[V,C[K]] (Scala 2.13.x)
implicit class MapInverter[K,V,C[_]](m: Map[K,C[V]]) {
  def invert(implicit iio: IsIterableOnce[C[V]] {type A = V}
                    , fac: Factory[K,C[K]]): Map[V,C[K]] =
    m.foldLeft(Map.empty[V, List[K]]) {
      case (acc, (k, vs)) =>
        iio(vs).iterator.foldLeft(acc) {
          case (a, v) =>
            a   (v -> (k::a.getOrElse(v,Nil)))
        }
    }.map{case (k,v) => k -> v.to(fac)}
}

usage:

val p = List(("a", List("x","y","z")), ("b", List("x")), ("c", List("y","z")))
val q = p.toMap.invert
//Map(x -> List(b, a), y -> List(c, a), z -> List(c, a))
  • Related