I'm trying to run a Couchbase query in an AWS Glue job, using the Spark Couchbase connector. My query is a simple N1QL query against an existing Couchbase bucket:
var queryResultRDD: RDD[CouchbaseQueryRow] = spark.sparkContext.couchbaseQuery(N1qlQuery.simple(CbN1qlQuery))
If I have a long running query, I receive the following error message:
Caused by: com.couchbase.client.java.error.QueryExecutionException: Timeout 1m15s exceeded
at com.couchbase.spark.connection.QueryAccessor$$anonfun$compute$1$$anonfun$apply$2$$anonfun$4.apply(QueryAccessor.scala:56)
at com.couchbase.spark.connection.QueryAccessor$$anonfun$compute$1$$anonfun$apply$2$$anonfun$4.apply(QueryAccessor.scala:53)
at rx.lang.scala.Observable$$anon$32.call(Observable.scala:1324)
The 1m15s timeout setting likely comes from Couchbase's 75s default query timeout, so I tried to add the query timeout setting directly to the query call hoping that it would override the default timeout setting:
var queryResultRDD: RDD[CouchbaseQueryRow] = sc.couchbaseQuery(N1qlQuery.simple(CbN1qlQuery), "bucket-name", Some(Duration(130, SECONDS)))
Dropping that duration to something impossibly low like 1 ms produced a different query timeout error. However, with a longer duration as shown above, I still received the same QueryExecutionException, and the reported timeout was still 1m15s. I also tried setting the system property in the Glue job script:
System.setProperty("com.couchbase.env.timeout.queryTimeout", "1ms")
Yet I still received the same 1m15s timeout error. I also tried setting the spark.couchbase.timeout.queryTimeout property in the same way and got the same result. Finally, I tried setting the query timeout configuration in the SparkSession builder, with no change:
val Spark = SparkSession
  .builder()
  .appName(DefaultName)
  .config("spark.couchbase.nodes", CbNodes)
  .config(s"spark.couchbase.bucket.$SourceBucketName", SourceBucketPassword)
  .config("spark.couchbase.username", SourceBucketUserName)
  .config("spark.couchbase.password", SourceBucketPassword)
  .config("spark.ssl.enabled", CbSslEnabled)
  .config("spark.ssl.keyStore", CbKeyStore)
  .config("spark.ssl.keyStorePassword", CbKeyStorePassword)
  .config("spark.couchbase.timeout.queryTimeout", "1ms")
  .getOrCreate()
How do I override this 1m15s query timeout setting?
Answer:
I figured it out after looking at the connector's CouchbaseConfig class to see what the config key is supposed to be:
.config("com.couchbase.queryTimeout", "10")
where the queryTimeout value is in milliseconds (a plain number, with no unit suffix such as "ms").
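For completeness, here is a minimal sketch of how that key might fit into the session builder from the question, raising the timeout to the 130 seconds I originally wanted. It assumes the same connector version as above; the object name, app name, node address, bucket name, and password are hypothetical placeholders, and the only key taken from the finding above is com.couchbase.queryTimeout.

```scala
import org.apache.spark.sql.SparkSession
import scala.concurrent.duration._

object QueryTimeoutExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical placeholder values; substitute your own cluster details.
    val cbNodes = "couchbase.example.internal"
    val sourceBucketName = "bucket-name"
    val sourceBucketPassword = "secret"

    // Express the timeout as a Duration, then convert it to the plain
    // millisecond string that the config key above is reported to expect.
    val queryTimeoutMs: String = 130.seconds.toMillis.toString

    val spark = SparkSession
      .builder()
      .appName("couchbase-query-timeout")
      .config("spark.couchbase.nodes", cbNodes)
      .config(s"spark.couchbase.bucket.$sourceBucketName", sourceBucketPassword)
      // Key as found in CouchbaseConfig (see above); value is in milliseconds.
      .config("com.couchbase.queryTimeout", queryTimeoutMs)
      .getOrCreate()

    // ... run sc.couchbaseQuery(...) here as in the question ...

    spark.stop()
  }
}
```

Deriving the string from a Duration avoids hand-counting zeros and makes it harder to pass a value like "1ms" by accident, which the key does not appear to accept.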