Why is the Spark SQL execution efficiency so low?

Time: 09-27

# test.py
# (sqlContext and url are assumed to be defined earlier in the script;
#  url is the JDBC connection string for the MariaDB instance)
df = sqlContext \
    .read \
    .format("jdbc") \
    .option("url", url) \
    .option("dbtable", "stock_detail_collect_20151105_1130") \
    .load()

# expose the DataFrame as a temporary SQL table
df.registerTempTable("people")

countsByAge = sqlContext.sql("select stock_id, count(*) as ct from people group by stock_id order by stock_id desc")
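For reference, a plain JDBC read like the one above produces a single partition, so the whole table scan runs in one task. A minimal sketch of the same read with Spark's JDBC partitioning options might look roughly like this (the partition column, bounds, and partition count below are assumptions for illustration, not values from the original post):

# Sketch only: assumes stock_id is a numeric column and its min/max are roughly known.
# Splitting the read into several partitions lets Spark scan and aggregate in parallel.
df = sqlContext.read \
    .format("jdbc") \
    .option("url", url) \
    .option("dbtable", "stock_detail_collect_20151105_1130") \
    .option("partitionColumn", "stock_id") \
    .option("lowerBound", "1") \
    .option("upperBound", "1000000") \
    .option("numPartitions", "8") \
    .load()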

=====================

./spark-submit --driver-class-path ../java/mariadb-java-client-1.3.6.jar ../test.py

The code above does not run as fast as executing the SQL statement directly in the database (MariaDB). Watching the process with the Linux top command while it runs, the multi-core CPU is not fully used: only one core is busy, at 100%.

Questions:
1. In a LAN environment, is this simply never going to be as fast as running the query directly in the database?
2. How can I make it use multiple cores? Adding --executor-cores 2 makes no difference; top still shows the same picture.
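As a quick check (a sketch, not part of the original post), the number of DataFrame partitions shows how many parallel tasks the scan can use; with the unpartitioned JDBC read it is 1, which would match the single busy core seen in top:

# Sketch: prints the partition count; 1 partition means a single task (and core) does the scan.
print(df.rdd.getNumPartitions())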