Using Scala ClassTags in Collections

Time:07-15

I'm trying to create an enum-like type in Scala with a generic type parameter, and later perform operations on its instances that depend on what that type parameter is. I'm using Scala's reflect.ClassTag to get information about the generic type. Something like this:

import scala.reflect.ClassTag

sealed trait X[T : ClassTag] {
  def value: T
}

case object I1 extends X[Int] { override def value = 1 }
case object I2 extends X[Int] { override def value = 2 }
case object Sa extends X[String] { override def value = "a" }
case object Sb extends X[String] { override def value = "b" }

val values = IndexedSeq(I1, I2, Sa, Sb)

values.foreach{
  case i: X[Int] => println(s"${i.value} => ${i.value + 1}")
  case s: X[String] => println(s"${s.value} => ${s.value.toUpperCase}")
}

This produces the following warnings:

the type test for Playground.X[Int] cannot be checked at runtime
the type test for Playground.X[String] cannot be checked at runtime

For completeness, when run, it produces the following output (which is reasonable given the warnings):

1 => 2
2 => 3
java.lang.ExceptionInInitializerError
    at Main$.<clinit>(main.scala:24)
    at Main.main(main.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at sbt.Run.invokeMain(Run.scala:143)
    at sbt.Run.execute$1(Run.scala:93)
    at sbt.Run.$anonfun$runWithLoader$5(Run.scala:120)
    at sbt.Run$.executeSuccess(Run.scala:186)
    at sbt.Run.runWithLoader(Run.scala:120)
    at sbt.Run.run(Run.scala:127)
    at com.olegych.scastie.sbtscastie.SbtScastiePlugin$$anon$1.$anonfun$run$1(SbtScastiePlugin.scala:38)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at sbt.util.InterfaceUtil$$anon$1.get(InterfaceUtil.scala:17)
    at sbt.ScastieTrapExit$App.run(ScastieTrapExit.scala:259)
    at java.base/java.lang.Thread.run(Thread.java:831)
Caused by: java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Integer (java.lang.String and java.lang.Integer are in module java.base of loader 'bootstrap')
    at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:99)
    at Playground$.$anonfun$1(main.scala:17)
    at scala.runtime.function.JProcedure1.apply(JProcedure1.java:15)
    at scala.runtime.function.JProcedure1.apply(JProcedure1.java:10)
    at scala.collection.immutable.Vector.foreach(Vector.scala:1856)
    at Playground$.<clinit>(main.scala:20)
    ... 17 more

I also tried doing it like this, with the singleton objects implemented as instances of case classes instead:

import scala.reflect.ClassTag

sealed trait X[T : ClassTag] {
  def value: T
}

case class I(value: Int) extends X[Int]
case class S(value: String) extends X[String]

val values = IndexedSeq(I(1), I(2), S("a"), S("b"))

values.foreach{
  case i: X[Int] => println(s"${i.value} => ${i.value + 1}")
  case s: X[String] => println(s"${s.value} => ${s.value.toUpperCase}")
}

But I get pretty much the exact same result.

When I do something that appears to me to be similar using a Scala Array, it works:

val values = IndexedSeq(
  Array(1, 2),
  Array(3, 4),
  Array("a", "b"),
  Array("c", "d")
)
values.foreach{
  case i: Array[Int] => println(s"""${i.mkString(",")} => ${i.map(_ * 2).mkString(",")}""")
  case s: Array[String] => println(s"""${s.mkString(",")} => ${s.map(_.toUpperCase).mkString(",")}""")
}

This produces no warnings and the correct output:

1,2 => 2,4
3,4 => 6,8
a,b => A,B
c,d => C,D

What am I doing wrong here? I thought ClassTag was supposed to preserve information about a generic type during runtime? I've seen that reflect.runtime.universe.TypeTag might be better, but the package containing that doesn't seem to be available in Scala 3, and somehow Array is able to do what I want anyway.

Answer:

From the ClassTag documentation:

A ClassTag[T] stores the erased class of a given type T, accessible via the runtimeClass field.

A ClassTag does not prevent the erasure of T; T still gets erased. All ClassTag does is store the erased class of T inside an object of type ClassTag[T], which the context bound adds as an implicit parameter. Implicit parameters are only in scope in the body of the method or class that declares them, so you can only use the tag inside the body of the trait, because that is where you defined it.

Trait X still does not know its type parameter T at runtime. You can get at the erased class of T inside the body of the trait, because the context bound gives you a reference to the tag there. But for pattern matching outside of the trait, a ClassTag context bound on the trait does not help at all.
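To make "stores the erased class" concrete, here is a minimal sketch. The helper name runtimeClassName is made up for illustration; classTag and runtimeClass are the standard scala.reflect members. Note that the tag for List[Int] only remembers List.

```scala
import scala.reflect.{ClassTag, classTag}

// Hypothetical helper: name of the erased runtime class held by the implicit ClassTag[T].
def runtimeClassName[T: ClassTag]: String = classTag[T].runtimeClass.getName

val intName  = runtimeClassName[Int]        // "int"
val strName  = runtimeClassName[String]     // "java.lang.String"
val listName = runtimeClassName[List[Int]]  // "scala.collection.immutable.List" (the Int is gone)
```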

So by doing:

sealed trait X[T : ClassTag]

Which is equivalent to:

sealed trait X[T](implicit cls: ClassTag[T])

You are assuming that pattern matching will then behave differently, but it does not. Pattern matching still cannot check the type arguments of a generic type, because they are erased at compile time. That is why you get the warnings. So the first case line:

case i: X[Int]

is equivalent to this:

case i: X[_]

Actually, both of your cases are equivalent to this, so the second one becomes dead code: the first always matches. As a result, every element of values is matched by the first case. Evaluating ${i.value + 1} works for the first two elements, but it throws a ClassCastException on the third (Sa), because the String value has to be unboxed to an Int before 1 can be added to it.

The closest you can get using ClassTag is to move the pattern match into a polymorphic method and pass the ClassTag value there via an implicit parameter, or more conveniently via a context bound:
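The erasure described above can be observed directly. The following is only a small sketch that reuses the X, I1 and Sa definitions from the question (without the context bound) and asks the JVM which class it sees:

```scala
sealed trait X[T] { def value: T }
case object I1 extends X[Int]    { override def value = 1 }
case object Sa extends X[String] { override def value = "a" }

// After erasure there is a single runtime class for X, whatever the type argument was.
val sameClass = classOf[X[Int]] == classOf[X[String]]

// So a runtime instance check can only answer "is this an X?".
val saPassesXIntCheck = classOf[X[Int]].isInstance(Sa)
```

Both values are true, even though Sa is an X[String].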

import scala.reflect.ClassTag

sealed trait X[T] {
  def value: T
}

case object I1 extends X[Int] { override def value = 1 }
case object I2 extends X[Int] { override def value = 2 }
case object Sa extends X[String] { override def value = "a" }
case object Sb extends X[String] { override def value = "b" }

val values = IndexedSeq(I1, I2, Sa, Sb)

def extract[T : ClassTag] = 
  values.flatMap { x => 
    x.value match {
      case y: T => Some(y)
      case _    => None
    } 
  }

val result = extract[Int]
val result2 = extract[String]

println(result)    // Vector(1, 2)
println(result2)   // Vector(a, b)

This gives the illusion that the generic type T is not erased at runtime, but it is. If you decompile the extract method, you will see this:

public <T> IndexedSeq<T> extract(final ClassTag<T> evidence$1) {
    return (IndexedSeq<T>)this.values().flatMap(x -> {
        final Object value = ((MainScala.X)x).value();
        if (value != null) {
            final Option unapply = evidence$1.unapply(value);
            if (!unapply.isEmpty() && unapply.get() instanceof Object) {
                final Object module$ = new Some(value);
                return (Option)module$;
            }
        }
        final Object module$ = None$.MODULE$;
        return (Option)module$;
    });
}

As you can see, the instanceof Object check means that type T has been erased. But the line before it is the one we are more interested in:

final Option unapply = evidence$1.unapply(value);

This is where ClassTag comes into play. If we look at the unapply method in ClassTag.scala, we see what it does:

  /** A ClassTag[T] can serve as an extractor that matches only objects of type T.
   *
   * The compiler tries to turn unchecked type tests in pattern matches into checked ones
   * by wrapping a `(_: T)` type pattern as `ct(_: T)`, where `ct` is the `ClassTag[T]` instance.
   * Type tests necessary before calling other extractors are treated similarly.
   * `SomeExtractor(...)` is turned into `ct(SomeExtractor(...))` if `T` in `SomeExtractor.unapply(x: T)`
   * is uncheckable, but we have an instance of `ClassTag[T]`.
   */
  def unapply(x: Any): Option[T] =
    if (runtimeClass.isInstance(x)) Some(x.asInstanceOf[T])
    else None

Here we see the runtimeClass field that the docs mention. The ClassTag does not restore the erased T. It carries the runtime class of the type argument the method was called with, so the check is made against Int for extract[Int], against String for extract[String], and so on.
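The same extractor can be called by hand, which shows that the check is an ordinary runtime class test. This is a minimal sketch; the value names are made up for illustration:

```scala
import scala.reflect.{ClassTag, classTag}

val intTag: ClassTag[Int]       = classTag[Int]
val stringTag: ClassTag[String] = classTag[String]

// unapply yields Some(x) only when x is an instance of the tag's runtime class.
val intFromInt       = intTag.unapply(1)        // Some(1)
val intFromString    = intTag.unapply("a")      // None
val stringFromString = stringTag.unapply("a")   // Some("a")
```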

Regarding your other questions: Making them case classes does not change anything because you are still pattern matching on generic types, which have their type parameters erased.

Changing the pattern match to use Arrays works because arrays are not erased generics on the JVM. Scala's Array[T] is written like a generic type, but it is implemented as a Java array, and the JVM keeps each array's element type at runtime (Array[Int] is an int[], Array[String] is a String[]). Arrays existed before generics were introduced in Java 5, so their element type was never erased to begin with.

This example is just for understanding the concept. You should not use reflection if you don't need it, and in this particular use case you don't. There are multiple alternatives: matching on the case classes directly, as AminMal mentioned, or matching on the value fields directly, etc.:
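A quick way to see the difference is to compare runtime classes. In this sketch (value names are illustrative only), the two arrays have different classes, while two lists with different element types share one:

```scala
// Arrays keep their element type in their runtime class.
val intArrayClass    = Array(1, 2).getClass.getSimpleName       // "int[]"
val stringArrayClass = Array("a", "b").getClass.getSimpleName   // "String[]"

// Generic collections do not: both lists have the same runtime class.
val sameListClass = List(1, 2).getClass == List("a", "b").getClass
```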

values.foreach { _.value match {
    case s: String => println(s"$s => ${s.toUpperCase}")
    case i: Int    => println(s"$i => ${i + 1}")
  }
}
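The other alternative, matching on the case classes themselves, could look like the sketch below. It assumes the case class variant from the question (I and S), without the ClassTag context bound, and collects the lines in a value named lines (a name chosen here for illustration) instead of printing them:

```scala
sealed trait X[T] { def value: T }
case class I(value: Int) extends X[Int]
case class S(value: String) extends X[String]

val values: IndexedSeq[X[_]] = IndexedSeq(I(1), I(2), S("a"), S("b"))

// I and S are distinct runtime classes, so these patterns are fully checkable.
val lines = values.map {
  case I(n) => s"$n => ${n + 1}"
  case S(s) => s"$s => ${s.toUpperCase}"
}

lines.foreach(println)
```

Because X is sealed, the compiler can also check that the match covers every case.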