Read files from a VM from Databricks


I want to read files from a VM into Databricks. I am able to SFTP to the VM from the Databricks driver, but I want to read the files using spark.read. I have tried:

val read_input_df = spark.read
  .format("com.springml.spark.sftp")
  .option("host", "SFTP_HOST")
  .option("username", "username")
  .option("password", "password")
  .option("fileType", "csv")
  .load("/home/file1.csv")

This fails with the error NoClassDefFoundError: scala/Product$class. Has anyone done this successfully?

CodePudding user response:

The problem is that you're using a library compiled for Scala 2.11 on a Databricks cluster whose runtime uses Scala 2.12 (DBR 7.x/8.x/9.x/10.x); the NoClassDefFoundError: scala/Product$class error is the typical symptom of that binary incompatibility. As of right now there is no released version of the connector for Spark 3.x/Scala 2.12, but there is a pull request that you can try to compile yourself and use.
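If you want to confirm the mismatch, you can print the Scala version from a notebook cell (a minimal sketch; the exact version string depends on your runtime):

// Print the Scala version the cluster runtime was built with.
// Databricks Runtime 7.x and later report 2.12.x, while the
// published com.springml.spark.sftp releases were built against 2.11.
println(scala.util.Properties.versionString)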

Another approach would be to copy the files first via SFTP onto DBFS (for example, like here), and then read them as usual, as sketched below.
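A minimal sketch of that approach, assuming the JSch SSH library (com.jcraft.jsch) is available on the cluster and using placeholder host, credentials, and paths:

import com.jcraft.jsch.{ChannelSftp, JSch}

// 1. Download the file from the VM to the driver's local disk via SFTP.
val jsch = new JSch()
val session = jsch.getSession("username", "SFTP_HOST", 22)
session.setPassword("password")
session.setConfig("StrictHostKeyChecking", "no")
session.connect()

val channel = session.openChannel("sftp").asInstanceOf[ChannelSftp]
channel.connect()
channel.get("/home/file1.csv", "/tmp/file1.csv")
channel.disconnect()
session.disconnect()

// 2. Copy the file from the driver's local filesystem to DBFS.
dbutils.fs.cp("file:/tmp/file1.csv", "dbfs:/tmp/file1.csv")

// 3. Read it with the regular CSV reader.
val read_input_df = spark.read
  .option("header", "true")
  .csv("dbfs:/tmp/file1.csv")

This avoids the Scala-version problem entirely, since only standard Spark and DBFS APIs are involved.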
