How to Set Log Level for Third Party Jar in Spark


I use Spark to write data from a Hive table to Kinetica using this jar: kinetica-spark-7.0.6.1-jar-with-dependencies.jar. However, when I run spark-submit, the logger from the jar prints the JDBC connection string, including its credentials:

...
22/03/03 03:00:58 INFO spark.LoaderParams: Using JDBC connection string: jdbc:kinetica://10.xx.xx.xx:9191;UID=xxx;PWD=xxx
22/03/03 03:00:58 INFO spark.LoaderParams: Installing truststore to bypass certificate check.
22/03/03 03:00:58 INFO spark.LoaderParams: Using URL(s) http://10.xx.xx.xx:9191to create a GPUdb connection
22/03/03 03:00:59 INFO spark.LoaderParams: Connecting to http://10.xx.xx.xx:9191 as user <xxx>
22/03/03 03:00:59 INFO spark.ContextCleaner: Cleaned accumulator 47
22/03/03 03:00:59 INFO spark.ContextCleaner: Cleaned accumulator 42
22/03/03 03:00:59 INFO spark.ContextCleaner: Cleaned accumulator 53
22/03/03 03:00:59 INFO spark.ContextCleaner: Cleaned accumulator 52
...

These messages come from the com.kinetica.spark.LoaderParams class (the %c{2} conversion pattern in my layout truncates the logger name to spark.LoaderParams).

I want to raise the log level for this LoaderParams class specifically, so that the connection string does not appear anywhere in the log.

Is there any way to do that?

Here's my log4j.properties config:

log4j.rootLogger=${root.logger}
root.logger=INFO,console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
shell.log.level=WARN
log4j.logger.org.eclipse.jetty=INFO
log4j.logger.org.spark-project.jetty=INFO
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
log4j.logger.org.apache.spark.repl.Main=${shell.log.level}
log4j.logger.org.apache.spark.api.python.PythonGatewayServer=${shell.log.level}

log4j.logger.jobLogger=INFO, RollingAppenderU
log4j.appender.RollingAppenderU=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppenderU.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppenderU.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppenderU.layout.ConversionPattern=[%p] %d %c %M - %m%n
log4j.appender.fileAppender.MaxFileSize=1MB
log4j.appender.fileAppender.MaxBackupIndex=1

CodePudding user response:

In my configuration, I use the following to log the LoaderParams statements at WARN and everything else from the Kinetica Spark connector at INFO:

log4j.logger.com.kinetica.spark=INFO
log4j.logger.com.kinetica.spark.LoaderParams=WARN
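
If these lines have no effect, a common cause is that Spark is loading the default log4j.properties from its conf directory rather than your edited copy. As a sketch (the local file path here is an assumption; adjust it for your deployment), you can ship the customized file with spark-submit and point both the driver and the executors at it:

spark-submit \
  --files /path/to/log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/path/to/log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  ...

The driver (in client mode) reads the file from the local path, while the executors read the basename because --files distributes the file into each executor's working directory.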