I started working with Spark and ran into a problem. When I tried to read a CSV file:

df = spark.read.csv("/home/oybek/Serverspace/Serverspace/Athletes.csv")
df.show(5)

I got this error:

Py4JJavaError: An error occurred while calling o38.csv.
: java.lang.OutOfMemoryError: Java heap space

I am working on Linux Ubuntu in VirtualBox:~/Serverspace .

P.S. Sorry for my English grammar =)
CodePudding user response:
You can try increasing the driver memory by building the Spark session with an explicit config, like below:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .master('local[*]') \
    .config("spark.driver.memory", "4g") \  # give the driver JVM 4 GB of heap
    .appName('read-csv') \
    .getOrCreate()

Note that in local mode the driver does all the work, so its heap size is what matters here; pick a value that fits inside the RAM you gave the VirtualBox VM.
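If you submit the job with spark-submit instead of building the session in code, the same setting can be passed on the command line or put in spark-defaults.conf. A sketch (the script name is a placeholder; adjust the value to your VM's available RAM):

```shell
# Pass the driver memory at launch time
# (equivalent to .config("spark.driver.memory", "4g") above)
spark-submit --driver-memory 4g read_csv.py   # read_csv.py is hypothetical

# Or set it once for all jobs in $SPARK_HOME/conf/spark-defaults.conf:
# spark.driver.memory    4g
```

Settings passed this way are picked up before the JVM starts, which is why `--driver-memory` is the reliable route for spark-submit: setting `spark.driver.memory` inside an already-running driver has no effect.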