Home > Back-end >  Why does Hibernate Search app take so much time to start?
Why does Hibernate Search app take so much time to start?

Time:07-26

I'm working on a Spring Boot App using Hibernate Search 5.11.10 Final and Lucene. Starting the app takes up to 2 minutes:

2022-07-25 14:53:05.381  INFO 16884 --- [  restartedMain] zuers.Application          : Starting Application using Java 11.0.14.1 on CVDE043 with PID 16884 (C:\blablapath\target\classes started by ladybug in C:\blablapath)
2022-07-25 14:53:05.381  INFO 16884 --- [  restartedMain] zuers.Application          : No active profile set, falling back to default profiles: default
2022-07-25 14:53:05.426  INFO 16884 --- [  restartedMain] .e.DevToolsPropertyDefaultsPostProcessor : Devtools property defaults active! Set 'spring.devtools.add-properties' to 'false' to disable
2022-07-25 14:53:05.426  INFO 16884 --- [  restartedMain] .e.DevToolsPropertyDefaultsPostProcessor : For additional web related logging consider setting the 'logging.level.web' property to 'DEBUG'
2022-07-25 14:53:06.166  INFO 16884 --- [  restartedMain] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2022-07-25 14:53:06.224  INFO 16884 --- [  restartedMain] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 49 ms. Found 1 JPA repository interfaces.
2022-07-25 14:53:06.795  INFO 16884 --- [  restartedMain] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8081 (http)
2022-07-25 14:53:06.805  INFO 16884 --- [  restartedMain] o.apache.catalina.core.StandardService   : Starting service [Tomcat]
2022-07-25 14:53:06.806  INFO 16884 --- [  restartedMain] org.apache.catalina.core.StandardEngine  : Starting Servlet engine: [Apache Tomcat/9.0.54]
2022-07-25 14:53:06.900  INFO 16884 --- [  restartedMain] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring embedded WebApplicationContext
2022-07-25 14:53:06.901  INFO 16884 --- [  restartedMain] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 1474 ms
2022-07-25 14:53:06.927  INFO 16884 --- [  restartedMain] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2022-07-25 14:53:07.213  INFO 16884 --- [  restartedMain] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
2022-07-25 14:53:07.220  INFO 16884 --- [  restartedMain] o.s.b.a.h2.H2ConsoleAutoConfiguration    : H2 console available at '/h2-console'. Database available at 'jdbc:h2:~/zuers-db-komplett'
2022-07-25 14:53:07.400  INFO 16884 --- [  restartedMain] o.hibernate.jpa.internal.util.LogHelper  : HHH000204: Processing PersistenceUnitInfo [name: default]
2022-07-25 14:53:07.488  INFO 16884 --- [  restartedMain] org.hibernate.Version                    : HHH000412: Hibernate ORM core version 5.4.32.Final
2022-07-25 14:53:07.653  INFO 16884 --- [  restartedMain] o.hibernate.annotations.common.Version   : HCANN000001: Hibernate Commons Annotations {5.1.2.Final}
2022-07-25 14:53:07.870  INFO 16884 --- [  restartedMain] org.hibernate.dialect.Dialect            : HHH000400: Using dialect: org.hibernate.dialect.H2Dialect
2022-07-25 14:53:08.307  INFO 16884 --- [  restartedMain] org.hibernate.search.engine.Version      : HSEARCH000034: Hibernate Search 5.11.10.Final
2022-07-25 14:54:59.745  INFO 16884 --- [  restartedMain] o.h.e.t.j.p.i.JtaPlatformInitiator       : HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]
2022-07-25 14:55:00.155  INFO 16884 --- [  restartedMain] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
2022-07-25 14:55:00.838  WARN 16884 --- [  restartedMain] JpaBaseConfiguration$JpaWebConfiguration : spring.jpa.open-in-view is enabled by default. Therefore, database queries may be performed during view rendering. Explicitly configure spring.jpa.open-in-view to disable this warning
2022-07-25 14:55:01.424  WARN 16884 --- [  restartedMain] ion$DefaultTemplateResolverConfiguration : Cannot find template location: classpath:/templates/ (please add some templates or check your Thymeleaf configuration)
2022-07-25 14:55:01.567  INFO 16884 --- [  restartedMain] o.s.b.d.a.OptionalLiveReloadServer       : LiveReload server is running on port 35729
2022-07-25 14:55:01.605  INFO 16884 --- [  restartedMain] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8081 (http) with context path ''
2022-07-25 14:55:01.614  INFO 16884 --- [  restartedMain] zuers.Application          : Started Application in 116.603 seconds (JVM running for 117.676)

My question is: What is happening between HSEARCH000034: Hibernate Search 5.11.10.Final and HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]? And is there any way this startup can go by faster?

Here is my applications.properties:

server.port=8081
spring.jpa.hibernate.ddl-auto=update
spring.h2.console.enabled=true
spring.datasource.url=jdbc:h2:~/zuers-db-komplett;AUTO_SERVER=TRUE
spring.jpa.properties.hibernate.search.lucene_version LATEST
spring.jpa.properties.hibernate.search.default.directory_provider = filesystem
spring.jpa.properties.hibernate.search.default.indexBase = ./var/lucene/indexes
spring.jpa.properties.hibernate.search.indexing_strategy = manual
spring.jpa.properties.hibernate.search.lucene.analysis_definition_provider = zuers.analyzer.AnalyzerProvider
spring.jpa.properties.hibernate.search.analyzer= strasseAnalyzer

Note: the database I'm working with has a size of 4GB

CodePudding user response:

What is happening

Hard to say for sure without access to a reproducer.

I would recommend trying to start your application with this property:

spring.jpa.properties.hibernate.search.autoregister_listeners = false

Don't be fooled by the (very bad) name of this property: it has nothing to do with listeners, in fact it Note this property was renamed to just hibernate.search.enabled in Hibernate Search 6 (which you should definitely consider upgrading to).

If, after setting this property, your application is still very slow to start, then your problem has nothing to do with Hibernate Search, so you can investigate other leads (slow database access?).

If, on the other hand, this solves your performance issue, then it definitely is Hibernate Search being slow. Which is curious, since Hibernate Search does little that is actually resource-intensive on startup. The worst I can imagine is Hibernate Search trying to create many indexes on the filesystem, like hundreds of them, if you have hundreds of indexed entity types in your application. But that would only be slow if your filesystem is slow to begin with... maybe your indexes are on a network share?

Beyond that, I'm not sure. Try setting the log level to TRACE for the org.hibernate.search category, maybe?

CodePudding user response:

During application startup, the Lucene indexer will create or update an index at the filesystem location:

/var/lucene/indexes

Basically two effects might be the cause for the observed loss of time:

  • (a) the size of the data source, ie number of tuples and/or attributes to be processed, or
  • (b) the IO throughput of the filesystem you are reading from or writing to.

Consider checking for IO bottlenecks (b), and if that can be ruled out that you don’t re-create the index with (massive amounts of) data every time the application starts. As the documentation warns, keep in mind that

reindexing is a potentially long-running task that should be triggered explicitly. To populate indexes with pre-existing data from the database, use mass indexing.

and that

dropping the schema means losing indexed data

Note: The hint in the other answer by @yrodiere should guide you to rule this option out.

  • Related