spring repository findAll() too slow-CodePudding

I'm experiencing a strange slowness in my new spring boot application, when I fetch a table for reporting purposes.

This is a simple table, no reference to other tables, and it has 5 columns. Rows are 50k. So, I use the simple findAll() method, which is available in JpaRepository.

When the "destination" result is the entity, the findAll() execution takes 5 minutes. When I setup DTO class projection, or interface projection, the execution takes 1-2 minutes.

I believe this is still too much for that amount of data.

Moreover hibernate statistics provides execution time like 0.5 seconds. What takes the remaining 1-2 minutes to get the data in DTO type?

CodePudding user response：

@Benda has given good pointers and It would be helpful to attach jvisualvm and use CPU profiling to see where is the time spent. Since the rows are large in number drivers generally make multiple round trips to fetch large amount of data to avoid out of memory scenario and while they are being loaded by hibernate - it gets accumulated it in fist level cache. I once came across a scenario where the transaction was not marked readonly and so at the end Hibernate was trying to dirty check to see if something changed. Also I would recommend you to integrate Javamelody. It's free, open-source and super easy to setup with Spring Boot ( just include a dependency in build). It can tell you the time spent in each use case with service level details and SQL timings - all with a good visual dashboard.

CodePudding user response：

Certainly the serialization/deserialization process which is cumbersome and need too much memory. You can do many things about that :

Increase the memory params of your application. See JVM memory options.
Use a java library which can handle serialization faster/better than default JDK system. Jackson is pretty good.
Put a cache system in place to store objects and retrieve them faster. You can use Spring @Cacheable annotation, @see https://www.baeldung.com/spring-cache-tutorial, or you can use Google guava libs to get a pretty good cache system.
Optimize your datamodel to have faster request/responses. See loading objects with Fetch.EAGER / Fetch.LAZY diffs and how you can do to get an optimized way to go for your application.
Use Pagination explained very well by John Thompson. It will be a faster way to retrieve a small set of results and let the users navigate inside them from 10 to 10 or 50 to 50.

With pagination example, you can achieve a fast display of your dataset in your spring boot application :

public interface ProductRepository extends PagingAndSortingRepository<Product, Integer> {

    List<Product> findAllByPrice(double price, Pageable pageable);
}

Starting point code from https://www.baeldung.com/spring-data-jpa-pagination-sorting, with instructions you can follow to get it working pretty well.

CodePudding user response：

I had a similar issue where Spring data was doing a select count(*) internally on my table before it did the findAll() To avoid this I overrided count()

This may or may not help your specific case.

import org.springframework.data.jpa.repository.JpaRepository;

public interface MyRepository extends JpaRepository<MyEntity,Integer> {
    
    @Override
    default long count(){
        //counts on very large tables take forever. Optionally add this
        return 0;
    }
}

The explain plan of the select count(*) in my case was huge and it look a long time when running the query manually.

explain select count(*) from MY_TABLE