I want to read files from a VM into Databricks. I can SFTP to the VM from the Databricks driver, but I want to read the files using spark.read. I have tried:
val read_input_df = spark.read
.format("com.springml.spark.sftp")
.option("host", "SFTP_HOST")
.option("username", "username")
.option("password", "password")
.option("fileType", "csv")
.load("/home/file1.csv")
but I am getting the error NoClassDefFoundError: scala/Product$class. Has anyone done this successfully?
CodePudding user response:
The problem is that you're using a library compiled for Scala 2.11 on a Databricks cluster runtime that uses Scala 2.12 (7.x/8.x/9.x/10.x). As of right now there is no released version for Spark 3.x/Scala 2.12, but there is a pull request that you can try to compile yourself and use.
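One way to confirm the mismatch is to print the Scala version of the cluster runtime from a notebook cell; the artifact suffix of the library you attach (e.g. `_2.11` in `spark-sftp_2.11`) has to match it. A minimal sketch:

```scala
// The Scala standard library reports the version it was built for.
// A library published with the _2.11 suffix loaded on a 2.12 runtime
// fails at class-loading time with errors like
// NoClassDefFoundError: scala/Product$class, because the trait
// encoding changed between Scala 2.11 and 2.12.
val scalaVersion = scala.util.Properties.versionNumberString
println(s"Runtime Scala version: $scalaVersion")
```

On Databricks Runtime 7.x and later this prints a 2.12.x version, which is why the 2.11-only build of the SFTP connector fails.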
Another approach is to first copy the files via SFTP onto DBFS (for example, like here), and then open them as usual.
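As a rough sketch of that copy-first approach, you can shell out to scp from the driver (assuming key-based SSH auth to the VM is already configured there; the host, user, and paths below are placeholders, and scp is used instead of an sftp batch purely for brevity). DBFS is visible to the driver as a local FUSE mount under /dbfs, so anything copied there can then be read with spark.read:

```scala
import scala.sys.process._

// Assemble the scp command that pulls one remote file into the
// /dbfs FUSE mount on the driver. Returning a Seq keeps the
// command assembly testable separately from actually running it.
def buildFetchCmd(user: String, host: String,
                  remotePath: String, localPath: String): Seq[String] =
  Seq("scp", s"$user@$host:$remotePath", localPath)

// On the Databricks driver you would then run (placeholders):
//   buildFetchCmd("username", "SFTP_HOST",
//                 "/home/file1.csv", "/dbfs/tmp/file1.csv").!
// and read the copied file as a regular DBFS path:
//   val df = spark.read.option("header", "true").csv("dbfs:/tmp/file1.csv")
```

This sidesteps the connector's Scala-version problem entirely, at the cost of an extra copy step before the read.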