I'm having an issue accessing Hibernate entity relationships in an ItemProcessor while running a Spring Batch job. The ItemProcessor is part of a chunk-based step. As far as I can tell, the ItemProcessor runs in a transaction and should therefore be able to lazily load entity relationships.
The issue
I'm getting the following exception as part of the ItemProcessor logic:
org.hibernate.LazyInitializationException: could not initialize proxy [org.powo.model.registry.Organisation#1] - no Session
at org.hibernate.proxy.AbstractLazyInitializer.initialize(AbstractLazyInitializer.java:169)
at org.hibernate.proxy.AbstractLazyInitializer.getImplementation(AbstractLazyInitializer.java:309)
at org.hibernate.proxy.pojo.bytebuddy.ByteBuddyInterceptor.intercept(ByteBuddyInterceptor.java:45)
at org.hibernate.proxy.ProxyConfiguration$InterceptorDispatcher.intercept(ProxyConfiguration.java:95)
at org.powo.model.registry.Organisation$HibernateProxy$OFnEWoXa.getIdentifier(Unknown Source)
at org.powo.model.solr.BaseSolrInputDocument.build(BaseSolrInputDocument.java:39)
at org.powo.model.solr.TaxonSolrInputDocument.<init>(TaxonSolrInputDocument.java:67)
at org.powo.model.Taxon.toSolrInputDocument(Taxon.java:1091)
at org.powo.job.reindex.TaxonToSolrInputDocumentProcessor.process(TaxonToSolrInputDocumentProcessor.java:20)
at org.powo.job.reindex.TaxonToSolrInputDocumentProcessor.process(TaxonToSolrInputDocumentProcessor.java:13)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.doProcess(SimpleChunkProcessor.java:126)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.transform(SimpleChunkProcessor.java:303)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.process(SimpleChunkProcessor.java:202)
at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:75)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:406)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:330)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:272)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:81)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:375)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:145)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:257)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:200)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:66)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:169)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:136)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:308)
at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:141)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Here's the ItemProcessor for reference (the entity relationships are traversed within the toSolrInputDocument logic):
@Component
public class TaxonToSolrInputDocumentProcessor implements ItemProcessor<Taxon, SolrInputDocument> {

    @Autowired
    private ApplicationContext context;

    @Override
    public SolrInputDocument process(Taxon item) throws Exception {
        // toSolrInputDocument traverses the lazy entity relationships
        return item.toSolrInputDocument(context);
    }
}
And I'm using an org.springframework.batch.item.database.HibernatePagingItemReader as the reader.
What I've tried
I've tried the following, but neither approach has prevented the error above:
- using a JpaPagingItemReader instead of HibernatePagingItemReader, but this still has the same issue
- using @Autowired to get a SessionFactory and then doing openSession/closeSession around the code which traverses the entity relationships
Because of the data model I'm not able to fetch all relationships in one query so I need to use a stateful session (though I would like to fetch some!).
CodePudding user response:
AFAIK Spring Batch runs the read, process and write phases of a chunk inside one transaction per chunk, but the paging item readers load the entities through a session or entity manager that they open themselves rather than one bound to that transaction. So if your ItemReader does not initialize an association that is needed in the processor or writer, you will run into this error as soon as the entities are no longer attached to an open session or entity manager. I don't know if it is possible to tell Spring to keep an entity manager around for the whole process, but if you want to rely on lazy initialization, you will have to figure this out.
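To make the failure mode concrete, the following is a toy model in plain Java, not Hibernate code. ToySession, LazyRef and LazyLoadingDemo are hypothetical names invented for this sketch; LazyRef only imitates how a lazy proxy depends on the session that created it. It shows that the first access to an association fails once the session is gone, while an association that was already initialized keeps working.

```java
import java.util.function.Supplier;

/** Toy stand-in for a session (hypothetical): it is either open or closed. */
final class ToySession {
    private boolean open = true;

    boolean isOpen() { return open; }

    void close() { open = false; }
}

/** Toy stand-in for a lazy proxy (hypothetical): loads on first access, only while the session is open. */
final class LazyRef<T> {
    private final ToySession session;
    private final Supplier<T> loader;
    private T value;
    private boolean initialized;

    LazyRef(ToySession session, Supplier<T> loader) {
        this.session = session;
        this.loader = loader;
    }

    T get() {
        if (!initialized) {
            if (!session.isOpen()) {
                // Mirrors the shape of the error in the question; the message is imitated, not produced by Hibernate.
                throw new IllegalStateException("could not initialize proxy - no Session");
            }
            value = loader.get();
            initialized = true;
        }
        return value;
    }
}

final class LazyLoadingDemo {
    public static void main(String[] args) {
        // Case 1: the association is touched for the first time after the session was closed.
        ToySession first = new ToySession();
        LazyRef<String> untouched = new LazyRef<>(first, () -> "Organisation#1");
        first.close();
        try {
            untouched.get();
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }

        // Case 2: the association is initialized while the session is still open.
        ToySession second = new ToySession();
        LazyRef<String> initializedEarly = new LazyRef<>(second, () -> "Organisation#1");
        initializedEarly.get();
        second.close();
        System.out.println("ok: " + initializedEarly.get());
    }
}
```

In the batch job, the reader plays the role of the code that holds the open session, and the processor is the code that touches the association later.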
I would recommend making sure that you use proper join fetches, though, to avoid the need for lazy initialization, e.g. select t from Taxon t join fetch t.organization o, or even better, use a DTO approach so that you don't have to care about lazy initialization at all.
I think this is a perfect use case for Blaze-Persistence Entity Views.
I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) via JPQL expressions to the entity model.
A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:
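@EntityView(Taxon.class)
public interface TaxonDto {
    @IdMapping
    Long getId();
    String getName();
    // Assumes Taxon exposes the association as "organization"; adjust to the real attribute name
    OrganizationDto getOrganization();

    // The entity in the stack trace is org.powo.model.registry.Organisation
    @EntityView(Organisation.class)
    interface OrganizationDto {
        @IdMapping
        Long getId();
        String getName();
    }
}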
Querying is a matter of applying the entity view to a query, the simplest being just a query by id.
TaxonDto a = entityViewManager.find(entityManager, TaxonDto.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features
Page<TaxonDto> findAll(Pageable pageable);
The best part is, it will only fetch the state that is actually necessary!
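As a rough illustration of why a DTO sidesteps lazy initialization, here is a plain-Java sketch that does not use Blaze-Persistence or Solr. TaxonData, OrganisationData and TaxonDataToFields are hypothetical names, the field names are invented, and a Map stands in for SolrInputDocument. The point is only that the processing step reads plain values that were populated at query time, so it never needs a session.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Plain value holder for the organisation data the index needs (hypothetical fields). */
final class OrganisationData {
    final long id;
    final String name;

    OrganisationData(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Plain value holder for a taxon, fully populated when it is read (hypothetical fields). */
final class TaxonData {
    final long id;
    final String name;
    final OrganisationData organisation;

    TaxonData(long id, String name, OrganisationData organisation) {
        this.id = id;
        this.name = name;
        this.organisation = organisation;
    }
}

/** Processor-like function: turns a DTO into index fields without touching any proxy or session. */
final class TaxonDataToFields implements Function<TaxonData, Map<String, Object>> {
    @Override
    public Map<String, Object> apply(TaxonData taxon) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("id", taxon.id);
        fields.put("name", taxon.name);
        fields.put("organisation_id", taxon.organisation.id);
        fields.put("organisation_name", taxon.organisation.name);
        return fields;
    }
}
```

In a real job the reader would produce the DTOs (for example from an entity view query) and the ItemProcessor would build the SolrInputDocument from them in the same way.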
CodePudding user response:
I've now resolved this: in the end, changing to JpaPagingItemReader did the trick. I'm not sure why it didn't work previously, although I did find that setting any kind of taskExecutor did seem to immediately break the lazy loading.