Installing Apache Spark Packages to run Locally


I am looking for a clear guide or set of steps for installing Spark packages (specifically spark-avro) to run locally, and for correctly using them with the spark-submit command.

I've spent a lot of time reading many posts and guides, but I am still not able to get spark-submit to use the locally deployed spark-avro package. So if someone has already accomplished this with spark-avro or another package, please share your wisdom :)

All the existing documentation I found is a bit unclear.

Clear steps and examples would be much appreciated! P.S. I know Python/PySpark/SQL, but not much Java (yet) ...

Michael

CodePudding user response:

You can pass the Avro package coordinates in the spark-submit command itself (make sure the spark-avro version matches your Spark and Scala versions):

spark-submit --packages org.apache.spark:spark-avro_<scala_version>:<spark_version>

For example, for Spark 2.4.0 built with Scala 2.11:

spark-submit --packages org.apache.spark:spark-avro_2.11:2.4.0
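
To tie it together, here is a minimal PySpark script that the command above could submit (a sketch; the script name read_avro.py and the input path are hypothetical):

# read_avro.py -- minimal sketch; the input path is hypothetical
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("avro-demo").getOrCreate()

# "avro" is the short format name registered by the spark-avro package
df = spark.read.format("avro").load("/tmp/users.avro")
df.show()

spark.stop()

You would then run it with:

spark-submit --packages org.apache.spark:spark-avro_2.11:2.4.0 read_avro.py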

You can pass the same --packages option to spark-shell as well to work with Avro files.
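
If you want to run locally without typing the flag every time, one option (a sketch, assuming Spark 2.4.0 built with Scala 2.11) is to set spark.jars.packages when building the session; Spark then resolves the package from Maven Central on startup:

from pyspark.sql import SparkSession

# spark.jars.packages must be set before the session (and JVM) starts;
# the coordinates must match your Spark and Scala versions
spark = (SparkSession.builder
         .master("local[*]")
         .appName("avro-demo")
         .config("spark.jars.packages", "org.apache.spark:spark-avro_2.11:2.4.0")
         .getOrCreate())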
