How to initialise SparkSession in Spark 3.x

I've been trying to learn Spark & Scala, and have an environment set up in IntelliJ.

I'd previously been using SparkContext to initialise my Spark instance successfully, using the following code:

import org.apache.spark._
val sc = new SparkContext("local[*]", "SparkTest")

When I started trying to load .csv data, most of the information I found used spark.read.format("csv").load("filename.csv"), but this requires first initialising a SparkSession object using:

val spark = SparkSession
  .builder()
  .master("local")
  .appName("Test")
  .getOrCreate()

But when I tried to use this, there didn't seem to be any SparkSession in org.apache.spark._ in my version of Spark 3.x.

As far as I'm aware, SparkContext is the Spark 1.x approach, and SparkSession was introduced in Spark 2.x, with spark.sql built into the SparkSession object.

My question is whether I'm incorrectly trying to load SparkSession, or whether there's a different way to approach initialising Spark (and loading .csv files) in Spark 3?

Spark version: 3.3.0

Scala version: 2.13.8

CodePudding user response:

If you are using a Maven-type project, try adding the Spark dependencies to the POM file; SparkSession lives in the spark-sql module (package org.apache.spark.sql), so it will not be found if only spark-core is on the classpath. Otherwise, for the sake of troubleshooting, create a new Maven-type project, add the dependencies, and check whether you still have the same issue.
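
As a concrete sketch (assuming Spark 3.3.0 with Scala 2.13, matching the versions in the question), the POM entries would look something like this:

<dependencies>
  <!-- Core Spark: SparkContext lives here -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.13</artifactId>
    <version>3.3.0</version>
  </dependency>

  <!-- Spark SQL: SparkSession, DataFrames and spark.read live here -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.3.0</version>
  </dependency>
</dependencies>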

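Once the spark-sql dependency resolves, a minimal sketch of initialising the session and reading a CSV could look like the following (the file name is the placeholder from the question, and the header option is an assumption about the file):

import org.apache.spark.sql.SparkSession  // note: org.apache.spark.sql, not org.apache.spark._

object SparkTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()                 // builder() comes first, then the config methods
      .master("local[*]")
      .appName("SparkTest")
      .getOrCreate()

    // spark.read becomes available once spark-sql is on the classpath
    val df = spark.read
      .format("csv")
      .option("header", "true")  // assumes the file has a header row
      .load("filename.csv")      // placeholder path from the question

    df.show()
    spark.stop()
  }
}

The key point is the import: SparkSession is in the org.apache.spark.sql package, so import org.apache.spark._ alone will never bring it into scope.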