How do I filter INFO messages out of Spark's log output?

Time:09-26

Following a method I found online, I configured the log4j.properties file like this:
# Set everything to be logged to the console
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender

With this, when I use spark-shell I see only WARN messages, which is very concise:
The worker. The worker - 1 - Lin - spark. Out
Lin@Lin-spark:/opt/data01/spark-1.3.0-bin-server-cdh5.4.0$ bin/spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
16/05/21 10:56:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.3.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_05)
Type in expressions to have them evaluated.
Type :help for more information.
16/05/21 10:56:56 WARN Utils: Your hostname, Lin-spark resolves to a loopback address: 127.0.1.1; using 10.170.56.63 instead (on interface eth0)
16/05/21 10:56:56 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Spark context available as sc.
SQL context available as sqlContext.

But when I run code written in IDEA, I still get a lot of INFO output. What is going on, and how do I deal with it?

16/05/21 10:57:52 INFO MemoryStore: Block broadcast_52_piece0 stored as bytes in memory (estimated size 2.0 KB, free 253.4 MB)
16/05/21 10:57:52 INFO BlockManagerInfo: Added broadcast_52_piece0 in memory on localhost:56191 (size: 2.0 KB, free: 256.8 MB)
16/05/21 10:57:52 INFO BlockManagerMaster: Updated info of block broadcast_52_piece0
16/05/21 10:57:52 INFO SparkContext: Created broadcast 52 from broadcast at DAGScheduler.scala:839
16/05/21 10:57:52 INFO DAGScheduler: Submitting 1 missing tasks from Stage 39 (MapPartitionsRDD[98] at map at homework3.scala:67)
16/05/21 10:57:52 INFO TaskSchedulerImpl: Adding task set 39.0 with 1 tasks
16/05/21 10:57:52 INFO TaskSetManager: Starting task 0.0 in stage 39.0 (TID 654, localhost, PROCESS_LOCAL, 1322 bytes)
16/05/21 10:57:52 INFO Executor: Running task 0.0 in stage 39.0 (TID 654)
16/05/21 10:57:52 INFO HadoopRDD: Input split: file:/opt/data02/sparkApp/IndexSearch/IRdata/reut2-007_491:0+4503
16/05/21 10:57:52 INFO Executor: Finished task 0.0 in stage 39.0 (TID 654). 1845 bytes result sent to driver
16/05/21 10:57:52 INFO TaskSetManager: Finished task 0.0 in stage 39.0 (TID 654) in 54 ms on localhost (1/1)
16/05/21 10:57:52 INFO TaskSchedulerImpl: Removed TaskSet 39.0, whose tasks have all completed, from pool
16/05/21 10:57:52 INFO DAGScheduler: Stage 39 (first at homework3.scala:68) finished in 0.054 s
16/05/21 10:57:52 INFO DAGScheduler: Job 29 finished: first at homework3.scala:68, took 0.056794 s

CodePudding user response:

To add: the complete log4j.properties file is:

# Set everything to be logged to the console
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

CodePudding user response:

log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

Change these directly to:
log4j.logger.org.apache.spark=WARN

See $SPARK_HOME/conf/log4j.properties

If that file does not exist yet, create it from the template: cp log4j.properties.template log4j.properties
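
A likely reason the IDEA run still prints INFO is that spark-shell puts $SPARK_HOME/conf on the classpath, while a program launched from the IDE does not, so log4j never finds the edited file there. As a rough sketch (not a definitive fix), the steps above could be scripted as below. The function name install_log4j_config and both directory arguments are made up for illustration, and the assumption that the project has a resources directory on its classpath (for example src/main/resources in a Maven or sbt layout) is mine, not something stated in this thread.

```shell
#!/bin/sh
# Sketch only: install_log4j_config and its two arguments are hypothetical names.
install_log4j_config() {
    conf_dir="$1"       # Spark conf directory, e.g. "$SPARK_HOME/conf"
    resources_dir="$2"  # assumed classpath directory of the IDE project
    props="$conf_dir/log4j.properties"

    # Create log4j.properties from the template if it does not exist yet.
    if [ ! -f "$props" ]; then
        cp "$conf_dir/log4j.properties.template" "$props" || return 1
    fi

    # Set the root level to WARN (rewritten through a temp file for portability).
    sed 's/^log4j\.rootCategory=.*/log4j.rootCategory=WARN, console/' "$props" > "$props.tmp" \
        && mv "$props.tmp" "$props" || return 1

    # Add the Spark logger override suggested in this thread, but only once.
    grep -q '^log4j\.logger\.org\.apache\.spark=' "$props" \
        || echo 'log4j.logger.org.apache.spark=WARN' >> "$props"

    # Copy the result into the project's resources so it lands on the classpath.
    mkdir -p "$resources_dir" && cp "$props" "$resources_dir/log4j.properties"
}
```

It could be called as install_log4j_config "$SPARK_HOME/conf" "/path/to/your/project/src/main/resources", where the project path is a placeholder. After that, rebuild and rerun from IDEA so the copied file is picked up.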

Most of the INFO output you are seeing comes from Spark itself; suppressing it is actually not recommended.