Spark fails to read from Elasticsearch/Opensearch. Invalid map received dynamic_date_formats

Time:08-26

Hi, I'm trying to use Scala 2.11.12, Spark 2.3.0 and elasticsearch-spark-20 7.7.0 to read from an OpenSearch 1.3.4 index with the following code:

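import org.apache.spark.sql.functions.lit
import spark.implicits._  // enables the 'Timestamp column syntax

spark.read.format("org.elasticsearch.spark.sql")
  .load("myIndex")
  .filter('Timestamp === lit(dateToRead))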

But I get this error:

22/08/17 15:30:42 ERROR EventManager$: Unexpected error retrieving offsets. Bailing out...
Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: invalid map received dynamic_date_formats=[yyyy-MM-dd HH:mm:ss||yyyy-MM-dd'T'HH:mm:ss.SSS||yyyy-MM-dd||yyyy-MM-dd'T'HH||yyyy-MM-dd'T'HH:mm]
    at org.elasticsearch.hadoop.serialization.dto.mapping.FieldParser.parseField(FieldParser.java:146)
    at org.elasticsearch.hadoop.serialization.dto.mapping.FieldParser.parseMapping(FieldParser.java:88)
    at org.elasticsearch.hadoop.serialization.dto.mapping.FieldParser.parseIndexMappings(FieldParser.java:69)
    at org.elasticsearch.hadoop.serialization.dto.mapping.FieldParser.parseMappings(FieldParser.java:40)
    at org.elasticsearch.hadoop.rest.RestClient.getMappings(RestClient.java:321)
    at org.elasticsearch.hadoop.rest.RestClient.getMappings(RestClient.java:307)
    at org.elasticsearch.hadoop.rest.RestRepository.getMappings(RestRepository.java:293)
    at org.elasticsearch.spark.sql.SchemaUtils$.discoverMappingAndGeoFields(SchemaUtils.scala:103)
    at org.elasticsearch.spark.sql.SchemaUtils$.discoverMapping(SchemaUtils.scala:91)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.lazySchema$lzycompute(DefaultSource.scala:229)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.lazySchema(DefaultSource.scala:229)
    at org.elasticsearch.spark.sql.ElasticsearchRelation$$anonfun$schema$1.apply(DefaultSource.scala:233)
    at org.elasticsearch.spark.sql.ElasticsearchRelation$$anonfun$schema$1.apply(DefaultSource.scala:233)
    at scala.Option.getOrElse(Option.scala:121)
    at org.elasticsearch.spark.sql.ElasticsearchRelation.schema(DefaultSource.scala:233)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
    at com.MyCalass$$anonfun$myMethod2$1.apply(MyCalass.scala:130)
    at com.MyCalass$$anonfun$myMethod2$1.apply(MyCalass.scala:126)
    at scala.util.Try$.apply(Try.scala:192)
    at com.MyCalass$.myMethod2(MyCalass.scala:126)
    at com.MyCalass$.myMethod(MyCalass.scala:55)
    at com.MyApp$.MyApp$$myMethod(MyApp.scala:107)
    at com.MyApp$$anonfun$main$2.apply(MyApp.scala:86)
    at com.MyApp$$anonfun$main$2.apply(MyApp.scala:76)
    at scala.Option.fold(Option.scala:158)
    at com.MyApp$.main(MyApp.scala:76)
    at com.MyApp.main(MyApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Command exiting with ret '1'

I've set the dynamic date mapping in OpenSearch, and I am able to write to the index with the correct mapping, but reading from it fails.

Answer:

I found the problem. The elasticsearch connector looks at the version number the cluster reports, so it treats the cluster as Elasticsearch 1.3.4 instead of OpenSearch 1.3.4. To solve this, add compatibility.override_main_response_version: true to your opensearch.yml file, which makes OpenSearch 1.x report an Elasticsearch 7.x-compatible version number.
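To confirm that the setting took effect after restarting the node, one option is to look at the version number the cluster reports on its root endpoint. The following is only a minimal sketch: the object name CheckReportedVersion, the helper reportedVersion, and the default address http://localhost:9200 are assumptions made for illustration, and it assumes the cluster is reachable over plain HTTP without authentication.

```scala
import scala.io.Source

object CheckReportedVersion {
  // Matches the "number" field of the root response, e.g. "number" : "1.3.4".
  // A regex is used here only to avoid pulling in a JSON library.
  private val VersionNumber = "\"number\"\\s*:\\s*\"([^\"]+)\"".r

  // Returns the version string found in the response body, if any.
  def reportedVersion(rootResponse: String): Option[String] =
    VersionNumber.findFirstMatchIn(rootResponse).map(_.group(1))

  def main(args: Array[String]): Unit = {
    // Assumption: unauthenticated HTTP access; pass another URL as the first argument.
    val url = args.headOption.getOrElse("http://localhost:9200")
    val source = Source.fromURL(url)
    val body = try source.mkString finally source.close()
    println(reportedVersion(body).getOrElse("no version number found"))
  }
}
```

If this still prints 1.3.4, the connector will most likely keep failing in the same way. With the override active, the OpenSearch 1.x documentation describes the reported number as 7.10.2.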
