Is there a way to read data without SQL in Spark?

Time:06-17

I am a beginner in Spark and was given an assignment to read data from a CSV file and run some queries on it using Spark Core. However, every online resource I find uses some form of SQL from the pyspark.sql module.

Is there any way to read data and query it (select, count, group by) using only Spark Core?

CodePudding user response:

Spark Core is built around the RDD (Resilient Distributed Dataset) concept. Here you can find more information and examples of processing text files.
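As a rough illustration of the RDD approach, here is a minimal sketch that does a select, a count, and a group by without touching pyspark.sql. The file name people.csv, its header row, and the column order (name, city, age) are assumptions made up for this example, as are the helper names parse_line and run; adjust them to the actual assignment data.

```python
import csv


def parse_line(line):
    """Split one CSV line into a list of fields (handles quoted commas)."""
    return next(csv.reader([line]))


def run(path="people.csv"):
    """Select, count and group by on a CSV file using only RDD operations.

    Assumes a hypothetical, non-empty file whose first line is a header
    and whose columns are: name, city, age.
    """
    # Imported here so parse_line can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    lines = sc.textFile(path)

    # Drop blank lines and the header row, then split each line into fields.
    header = lines.first()
    rows = (
        lines.filter(lambda l: l.strip() != "" and l != header)
        .map(parse_line)
    )

    # Roughly: SELECT name, city
    selected = rows.map(lambda r: (r[0], r[1]))

    # Roughly: SELECT COUNT(*)
    total = rows.count()

    # Roughly: SELECT city, COUNT(*) ... GROUP BY city
    per_city = rows.map(lambda r: (r[1], 1)).reduceByKey(lambda a, b: a + b)

    return selected.collect(), total, per_city.collect()
```

Calling run("people.csv") returns the selected pairs, the row count, and a list of (city, count) pairs. Note that collect() pulls everything to the driver, so on large data it is usually better to use take(n) or write the result out with saveAsTextFile.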

CodePudding user response:

It's good practice to use Spark DataFrames instead of Spark RDDs.

Spark DataFrames go through the Catalyst optimizer, which automatically rewrites the query plan internally to improve performance.

https://blog.bi-geek.com/en/spark-sql-optimizador-catalyst/
