The problem of hive spark operation

Time:10-29

I recently received a task: join one Hive table with several other tables to produce a common (wide) table. I have already written the SQL, and now I have been asked to run it on Spark. How can I implement this with Spark SQL?
I am writing the code in Scala IDE for Eclipse, but my test code cannot connect to the Hive server. How should this connection be set up, and does the code need a lot of environment configuration?

CodePudding user response:

With Spark SQL you can read the raw Hive data directly from HDFS, register it as an in-memory (temporary) table, and then run your SQL against it. Write the result back to HDFS, and finally build your Hive result table on top of that output data.
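A minimal sketch of that flow, assuming Spark 2.x with Hive support on the classpath and a hive-site.xml visible to the driver so the metastore can be reached. The database and table names (db.orders, db.users, db.result) and the join SQL are hypothetical placeholders for the poster's own query:

```scala
// Sketch only: requires a Spark+Hive installation; names below are placeholders.
import org.apache.spark.sql.SparkSession

object HiveJoinJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-join-example")
      .enableHiveSupport()          // lets spark.sql() see tables in the Hive metastore
      .getOrCreate()

    // Run the same SQL already written for Hive; Spark reads the
    // underlying files on HDFS through the Hive metastore.
    val res = spark.sql(
      """SELECT o.id, u.name
        |FROM db.orders o
        |JOIN db.users u ON o.user_id = u.id""".stripMargin)

    // Persist the result as a Hive table backed by HDFS.
    res.write.mode("overwrite").saveAsTable("db.result")

    spark.stop()
  }
}
```

Submitting this with spark-submit avoids the Eclipse-side connection problem entirely, since the job runs where the cluster configuration already lives.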

CodePudding user response:

Run spark-shell from the command line and call Spark SQL directly:

scala> import org.apache.spark.sql.SQLContext
scala> val sqlContext = new SQLContext(sc)
scala> val res = sqlContext.sql("select current_date")
scala> res.show()
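The SQLContext above only evaluates plain expressions; to query actual Hive tables from the shell, the Spark 1.x API uses HiveContext instead. A sketch, assuming Spark was built with Hive support and hive-site.xml sits in Spark's conf/ directory; the table name some_db.some_table is a placeholder:

```scala
// spark-shell sketch (Spark 1.x API): sc is provided by the shell.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
hiveContext.sql("show tables").show()                          // lists tables in the Hive metastore
hiveContext.sql("select * from some_db.some_table limit 10").show()
```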