Read CSV file on Spark


I started working with Spark and ran into a problem. When I try to read a CSV file:

df = spark.read.csv("/home/oybek/Serverspace/Serverspace/Athletes.csv")
df.show(5)

I get this error:

Py4JJavaError: An error occurred while calling o38.csv.
: java.lang.OutOfMemoryError: Java heap space

I am working on Ubuntu Linux in VirtualBox (~/Serverspace).

P.S. Sorry for my English grammar =)

CodePudding user response:

You can try increasing the driver memory by building the Spark session yourself, like below:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master('local[*]')                    # run locally, using all available cores
    .config("spark.driver.memory", "4g")   # raise the driver heap above the 1g default
    .appName('read-csv')
    .getOrCreate()
)
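
With the larger driver memory in place, the original read should succeed. A minimal follow-up sketch (the header and inferSchema options are assumptions, adjust them to your file):

# Reuse the session built above; these reader options are assumptions.
df = spark.read.csv(
    "/home/oybek/Serverspace/Serverspace/Athletes.csv",
    header=True,        # treat the first row as column names
    inferSchema=True,   # let Spark infer column types
)
df.show(5)

Note that spark.driver.memory only takes effect when the driver JVM starts, so create this session before any other Spark code runs (or restart the interpreter if a session already exists).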