Scala: provide class type in parameter


I have a method that takes classes as parameters, like below.

    val hBaseRDD = spark.sparkContext.newAPIHadoopFile(path,
      classOf[org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat[ImmutableBytesWritable, Result]],
      classOf[ImmutableBytesWritable],
      classOf[Result], conf)

I want to write a method that takes the class types as parameters, and then I can call this line inside it, like below.

    case class SequenceInput(conf: Configuration,
                             path: String,
                             storageClass: String,
                             keyClass: String,
                             valueClass: String) {
      override def read(sparkSession: SparkSession): DataFrame = {
        val rdd = sparkSession.sparkContext.newAPIHadoopFile(path,
          classOf[storageClass],
          classOf[keyClass],
          classOf[valueClass], conf)
        rdd
      }
    }

But this asks me to create classes named storageClass, keyClass, and valueClass, whereas these are variables that hold the class names.

How to do this?

CodePudding user response:

You're writing a constructor, not a method. Change

    storageClass: String,
    keyClass: String,
    valueClass: String

to be Class objects, not Strings.

Then your function can:

    return sparkSession.sparkContext.newAPIHadoopFile(path,
      storageClass,
      keyClass,
      valueClass, conf)
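
For illustration, here is a minimal sketch of what the revised case class could look like, assuming you also add type parameters so the classes line up with newAPIHadoopFile's signature:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.mapreduce.InputFormat
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.SparkSession

    // Sketch: Class objects as parameters, with type parameters so that
    // F is provably an InputFormat[K, V] and no casts are needed.
    case class SequenceInput[K, V, F <: InputFormat[K, V]](conf: Configuration,
                                                           storageClass: Class[F],
                                                           keyClass: Class[K],
                                                           valueClass: Class[V]) {
      def read(sparkSession: SparkSession, path: String): RDD[(K, V)] =
        sparkSession.sparkContext.newAPIHadoopFile(path,
          storageClass, keyClass, valueClass, conf)
    }

    // Hypothetical call site, using the classes from the question:
    // val rdd = SequenceInput(conf,
    //   classOf[SequenceFileInputFormat[ImmutableBytesWritable, Result]],
    //   classOf[ImmutableBytesWritable],
    //   classOf[Result]).read(spark, path)

If the class names only arrive as Strings, though, Class.forName gives you a Class[_], which needs a cast to satisfy these bounds (see the second answer below).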

Then:

    val storageClass = Class.forName(config.get("storage_class"))
    ...
    // remove path from the constructor, since you should be able to use multiple paths
    val df = SequenceInput(storageClass, ...).read(spark, path)

Keep in mind that Class.forName expects the fully qualified name, not just "ImmutableBytesWritable", for example.
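
For the classes in the question, that means something like:

    // Fully qualified names, as Class.forName requires:
    val keyClass = Class.forName("org.apache.hadoop.hbase.io.ImmutableBytesWritable")
    val valueClass = Class.forName("org.apache.hadoop.hbase.client.Result")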

CodePudding user response:

If I understand correctly, you need to convert a String into a Class. You can do this with Class.forName(String):

    case class SequenceInput(conf: Configuration,
                             path: String,
                             storageClass: String,
                             keyClass: String,
                             valueClass: String) {
      override def read(sparkSession: SparkSession): DataFrame = {
        val rdd = sparkSession.sparkContext.newAPIHadoopFile(path,
          Class.forName(storageClass),
          Class.forName(keyClass),
          Class.forName(valueClass), conf)
        rdd
      }
    }
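
One caveat: Class.forName returns a Class[_], and this overload of newAPIHadoopFile expects a Class[F] with F <: InputFormat[K, V], so the snippet above may not type-check as written (and newAPIHadoopFile returns an RDD, not a DataFrame). A rough sketch of one workaround, assuming unchecked casts are acceptable (they are harmless at runtime, since generics are erased):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.mapreduce.InputFormat
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.SparkSession

    case class SequenceInput(conf: Configuration,
                             path: String,
                             storageClass: String,
                             keyClass: String,
                             valueClass: String) {
      // Sketch: cast the erased Class[_] values into the shapes the
      // newAPIHadoopFile signature expects; unchecked but safe at runtime.
      def read(sparkSession: SparkSession): RDD[(Any, Any)] =
        sparkSession.sparkContext.newAPIHadoopFile(path,
          Class.forName(storageClass).asInstanceOf[Class[InputFormat[Any, Any]]],
          Class.forName(keyClass).asInstanceOf[Class[Any]],
          Class.forName(valueClass).asInstanceOf[Class[Any]],
          conf)
    }

Consumers of the returned RDD[(Any, Any)] then cast the keys and values back to the concrete types they expect.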