I've been trying to learn Spark & Scala, and have an environment set up in IntelliJ.
I'd previously been using SparkContext to initialise my Spark instance successfully, with the following code:
import org.apache.spark._
val sc = new SparkContext("local[*]", "SparkTest")
When I tried to start loading .csv data in, most information I found used spark.read.format("csv").load("filename.csv"), but this requires initialising a SparkSession object using:
val spark = SparkSession
  .builder()
  .master("local")
  .appName("Test")
  .getOrCreate()
But when I tried to use this, there didn't seem to be any SparkSession in org.apache.spark._ in my version of Spark (3.x). As far as I'm aware, using SparkContext is the Spark 1.x method, and SparkSession is the Spark 2.x one, where spark.sql is built into the SparkSession object.
My question is whether I'm incorrectly trying to load SparkSession, or if there's a separate way to approach initialising Spark (and loading .csv files) in Spark 3?
Spark version: 3.3.0
Scala version: 2.13.8
CodePudding user response:
If you are using a Maven-type project, try adding the Spark dependencies to the POM file. Otherwise, for the sake of troubleshooting, create a new Maven-type project, add the dependencies, and check whether you still have the same issue.
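As a rough sketch of what that might look like, assuming a Maven project built against Scala 2.13 with Spark 3.3.0 (match the versions to your own setup): SparkSession lives in the org.apache.spark.sql package, which is provided by the spark-sql artifact rather than spark-core, so the POM would need something like:

<!-- spark-core provides SparkContext -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.13</artifactId>
    <version>3.3.0</version>
</dependency>
<!-- spark-sql provides SparkSession and the DataFrame API -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.3.0</version>
</dependency>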
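With the dependency in place, a minimal sketch of the initialisation and the .csv load could look like this; the key point is that SparkSession is imported from org.apache.spark.sql, not org.apache.spark._, and that builder() is called before master() and appName(). The header option is an assumption about your file's layout:

import org.apache.spark.sql.SparkSession  // note: not in org.apache.spark._

val spark = SparkSession
  .builder()            // builder() comes first, then configuration
  .master("local[*]")
  .appName("Test")
  .getOrCreate()

// Load the .csv into a DataFrame; "header" assumes the first row holds column names
val df = spark.read
  .format("csv")
  .option("header", "true")
  .load("filename.csv")

df.show()

If you still need the lower-level API, the session exposes it as spark.sparkContext, so there's no need to create a separate SparkContext in Spark 3.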