Hi, I'm trying to use Scala 2.11.12, Spark 2.3.0, and elasticsearch-spark-20 7.7.0 to read from an OpenSearch 1.3.4 index with the following code:
import org.apache.spark.sql.functions.lit
import spark.implicits._ // for the 'Timestamp symbol-to-column syntax
spark.read.format("org.elasticsearch.spark.sql")
  .load("myIndex")
  .filter('Timestamp === lit(dateToRead))
But I get this error:
22/08/17 15:30:42 ERROR EventManager$: Unexpected error retrieving offsets. Bailing out...
Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: invalid map received dynamic_date_formats=[yyyy-MM-dd HH:mm:ss||yyyy-MM-dd'T'HH:mm:ss.SSS||yyyy-MM-dd||yyyy-MM-dd'T'HH||yyyy-MM-dd'T'HH:mm]
at org.elasticsearch.hadoop.serialization.dto.mapping.FieldParser.parseField(FieldParser.java:146)
at org.elasticsearch.hadoop.serialization.dto.mapping.FieldParser.parseMapping(FieldParser.java:88)
at org.elasticsearch.hadoop.serialization.dto.mapping.FieldParser.parseIndexMappings(FieldParser.java:69)
at org.elasticsearch.hadoop.serialization.dto.mapping.FieldParser.parseMappings(FieldParser.java:40)
at org.elasticsearch.hadoop.rest.RestClient.getMappings(RestClient.java:321)
at org.elasticsearch.hadoop.rest.RestClient.getMappings(RestClient.java:307)
at org.elasticsearch.hadoop.rest.RestRepository.getMappings(RestRepository.java:293)
at org.elasticsearch.spark.sql.SchemaUtils$.discoverMappingAndGeoFields(SchemaUtils.scala:103)
at org.elasticsearch.spark.sql.SchemaUtils$.discoverMapping(SchemaUtils.scala:91)
at org.elasticsearch.spark.sql.ElasticsearchRelation.lazySchema$lzycompute(DefaultSource.scala:229)
at org.elasticsearch.spark.sql.ElasticsearchRelation.lazySchema(DefaultSource.scala:229)
at org.elasticsearch.spark.sql.ElasticsearchRelation$$anonfun$schema$1.apply(DefaultSource.scala:233)
at org.elasticsearch.spark.sql.ElasticsearchRelation$$anonfun$schema$1.apply(DefaultSource.scala:233)
at scala.Option.getOrElse(Option.scala:121)
at org.elasticsearch.spark.sql.ElasticsearchRelation.schema(DefaultSource.scala:233)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
at com.MyCalass$$anonfun$myMethod2$1.apply(MyCalass.scala:130)
at com.MyCalass$$anonfun$myMethod2$1.apply(MyCalass.scala:126)
at scala.util.Try$.apply(Try.scala:192)
at com.MyCalass$.myMethod2(MyCalass.scala:126)
at com.MyCalass$.myMethod(MyCalass.scala:55)
at com.MyApp$.MyApp$$myMethod(MyApp.scala:107)
at com.MyApp$$anonfun$main$2.apply(MyApp.scala:86)
at com.MyApp$$anonfun$main$2.apply(MyApp.scala:76)
at scala.Option.fold(Option.scala:158)
at com.MyApp$.main(MyApp.scala:76)
at com.MyApp.main(MyApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Command exiting with ret '1'
I've set the dynamic date mapping in OpenSearch, and I'm able to write to the index with the correct mapping, but when I try to read, it fails.
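For reference, the dynamic date mapping on the index looks roughly like this (the format string is the one echoed back in the error message above; myIndex is the placeholder index name from the snippet):

PUT myIndex
{
  "mappings": {
    "dynamic_date_formats": [
      "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd'T'HH:mm:ss.SSS||yyyy-MM-dd||yyyy-MM-dd'T'HH||yyyy-MM-dd'T'HH:mm"
    ]
  }
}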
CodePudding user response:
I found out the problem: basically, the Elasticsearch connector doesn't recognize OpenSearch and reads the cluster's "1.3.4" version string as Elasticsearch 1.3.4 instead of OpenSearch 1.3.4, which it can't work with. To solve this, add compatibility.override_main_response_version: true to your opensearch.yml file.
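The connector decides how to talk to the cluster based on the version string returned by the root endpoint, and with the compatibility flag set, OpenSearch reports itself as Elasticsearch 7.10.2 instead, so a 7.x connector works. In opensearch.yml:

compatibility.override_main_response_version: true

If I remember right, the same setting can also be applied at runtime through the cluster settings API instead of editing the file and restarting every node (this sketch assumes an unauthenticated cluster on localhost:9200):

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "compatibility.override_main_response_version": true
  }
}'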