If the data is correlated, can Spark or Hadoop still be used for distributed processing?

Time:09-22

If the data records are related to one another, can Spark or Hadoop still be used to process them in a distributed way? Or is there another framework or method that would shorten the computation time?

CodePudding user response:

By related data, do you mean something like a join in a relational database query? There are many SQL-on-Hadoop components that run on top of Hadoop, such as Hive, Spark, Impala, and Phoenix. In that setup Hadoop itself is only responsible for storage and job scheduling.
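
To illustrate why related records do not rule out distributed processing, here is a minimal plain-Python sketch of the general idea behind a partitioned hash join: both tables are split by a hash of the join key, so rows that belong together land in the same partition and each partition can be joined independently. This is not the actual implementation used by Hive, Spark, or Impala. The function names and the sample `users` and `orders` data are made up for the example, and the partitions are processed in a simple loop instead of on separate machines.

```python
from collections import defaultdict


def partition_by_key(rows, num_partitions):
    """Route each (key, value) row to a partition chosen by hashing its key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[0]) % num_partitions].append(row)
    return partitions


def join_partition(left_rows, right_rows):
    """Inner hash join within one partition: build a lookup on the left, probe with the right."""
    lookup = defaultdict(list)
    for key, value in left_rows:
        lookup[key].append(value)
    return [(key, lv, rv) for key, rv in right_rows for lv in lookup.get(key, [])]


def distributed_join(left, right, num_partitions=4):
    """Partition both sides the same way, then join matching partitions one by one.

    In a real cluster each pair of partitions would be handled by a different worker.
    """
    left_parts = partition_by_key(left, num_partitions)
    right_parts = partition_by_key(right, num_partitions)
    result = []
    for lp, rp in zip(left_parts, right_parts):
        result.extend(join_partition(lp, rp))
    return result


# Hypothetical sample data: (user_id, name) and (user_id, item)
users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (1, "pen"), (3, "lamp"), (4, "desk")]

joined = sorted(distributed_join(users, orders))
print(joined)
```

With one of the engines mentioned above, the same join would normally be written as a SQL query or a DataFrame join, and the engine would take care of partitioning and scheduling.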