What is the best practice for creating repositories in Spring Boot?

I want to create a one-to-many mapping, such as a Post that has many Comments. I see two ways to add comments. The first is to create a repository for Comment. The second is to use PostRepository, load the post, and add the comment to it. Each solution has its own challenges.

In the first solution, creating a repository per entity makes the number of repositories grow too large, and according to DDD, repositories should be created only for aggregate roots.

In the second solution, there are performance issues. To load, add, or remove nested entities, the root entity must be loaded first. To add an entity, other related entities, such as the User entity referenced by the Comment entity, must be loaded from userRepository. As a result, these additional loads reduce speed and overall performance.

What is the best practice to load, add or remove nested entities?

File Post.java

@Entity
@Table(name = "posts")
@Getter
@Setter
public class Post
{
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Size(max = 250)
    private String description;

    @NotNull
    @Lob
    private String content;

    @OneToMany(mappedBy = "post", fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    private Set<Comment> comments = new HashSet<>();

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "user_id", nullable = false)
    @OnDelete(action = OnDeleteAction.CASCADE)
    private User user;
}

File Comment.java

@Entity
@Table(name = "comments")
@Getter
@Setter
public class Comment {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @NotNull
    @Lob
    private String text;

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "post_id", nullable = false)
    @OnDelete(action = OnDeleteAction.CASCADE)
    private Post post;

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "user_id", nullable = false)
    @OnDelete(action = OnDeleteAction.CASCADE)
    private User user;
}

File User.java

@Entity
@Table(name = "Users")
@Getter
@Setter
public class User
{   
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id; 

    @OneToMany(mappedBy = "user", fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    private Set<Comment> comments = new HashSet<>();

    @OneToMany(mappedBy = "user", fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    private Set<Post> posts = new HashSet<>();
}

CodePudding user response：

"Best" is not well defined, but here is what can probably be considered the canonical stance of the Spring Data team on this question.

You definitely should NOT have one repository per entity (see "Are you supposed to have one repository per table in JPA?").

The reason is certainly not that you'd have too many classes or interfaces. Classes and interfaces are really cheap to create, both at implementation time and at run time. It is hard to have so many of them that they pose a significant problem. And if they did, the entities themselves would already be causing a problem.

The reason is that repositories handle aggregates, not entities, although admittedly the difference is hard to see in JPA-based code. So your question boils down to: what should be an aggregate?

At least part of the answer is already in your question:

In the second solution, there are performance issues. To load, add, or remove nested entities, the root entity must be loaded first. To add an entity, other related entities, such as the User entity referenced by the Comment entity, must be loaded from userRepository. As a result, these additional loads reduce speed and overall performance.

The concepts of aggregate and repository are widely adopted in the microservice community because they lead to good scalability. This isn't the same as "speed and total performance", but it is certainly related.

So how do these two views fit together? Andrey B. Panfilov is onto something with their comment:

@OneToMany is actually @OneToFew like "person may be reachable by a couple of phone numbers".

But it only describes a heuristic.

The real rule is: An aggregate should group classes that need to be consistent at all times. The canonical example is a purchase order with its line items. Line items on their own don't make sense. And if you modify a line item (or add or remove one), you might have to update the purchase order, for example to update the total price or to maintain constraints like a maximum value. So a purchase order should be an aggregate that includes its line items.

This also means that you need to load an aggregate completely. This in turn means that it can't be too big, because otherwise you'd run into performance problems.
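The purchase order rule can be sketched in plain Java. This is a minimal illustration, not code from any library: the names PurchaseOrder, LineItem, addItem, and maxTotalCents are hypothetical, and amounts are kept as cents in a long to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical aggregate root. Line items are only changed through the
// order, so the total and the maximum-value rule cannot drift out of sync.
class PurchaseOrder {

    // A line item has no identity or lifecycle outside its order.
    static final class LineItem {
        final String product;
        final int quantity;
        final long unitPriceCents;

        LineItem(String product, int quantity, long unitPriceCents) {
            this.product = product;
            this.quantity = quantity;
            this.unitPriceCents = unitPriceCents;
        }

        long priceCents() {
            return quantity * unitPriceCents;
        }
    }

    private final long maxTotalCents;
    private final List<LineItem> items = new ArrayList<>();
    private long totalCents = 0;

    PurchaseOrder(long maxTotalCents) {
        this.maxTotalCents = maxTotalCents;
    }

    // Adding an item checks the constraint and updates the total in one step.
    void addItem(String product, int quantity, long unitPriceCents) {
        LineItem item = new LineItem(product, quantity, unitPriceCents);
        long newTotal = totalCents + item.priceCents();
        if (newTotal > maxTotalCents) {
            throw new IllegalStateException("order total would exceed the maximum");
        }
        items.add(item);
        totalCents = newTotal;
    }

    long totalCents() {
        return totalCents;
    }

    // Read-only view: callers cannot bypass addItem.
    List<LineItem> items() {
        return Collections.unmodifiableList(items);
    }
}
```

Because every change goes through the root, a repository for this aggregate only ever has to load and save a whole PurchaseOrder.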

In your example of Post, Comment, and User, Post might form an aggregate with Comment. But in most systems the number of comments is close to unlimited and can be huge. I therefore would vote for making each entity in your example its own aggregate.
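One way to picture "each entity is its own aggregate" is to let a comment refer to its post and user by id only. The sketch below is plain Java with an in-memory list standing in for the database. CommentAggregate and InMemoryCommentRepository are hypothetical names, and id-only references are an assumption borrowed from the Spring Data JDBC style, not something the JPA entities in the question already do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Comment aggregate: it points at Post and User by id only,
// so creating a comment never requires loading the post or the user.
final class CommentAggregate {
    final long id;
    final long postId;
    final long userId;
    final String text;

    CommentAggregate(long id, long postId, long userId, String text) {
        this.id = id;
        this.postId = postId;
        this.userId = userId;
        this.text = text;
    }
}

// Hypothetical in-memory repository for the Comment aggregate.
class InMemoryCommentRepository {
    private final List<CommentAggregate> rows = new ArrayList<>();
    private long nextId = 1;

    // Stores a new comment and assigns the next free id.
    CommentAggregate add(long postId, long userId, String text) {
        CommentAggregate comment = new CommentAggregate(nextId++, postId, userId, text);
        rows.add(comment);
        return comment;
    }

    // Returns the comments of one post without touching the post itself.
    List<CommentAggregate> findByPostId(long postId) {
        List<CommentAggregate> result = new ArrayList<>();
        for (CommentAggregate comment : rows) {
            if (comment.postId == postId) {
                result.add(comment);
            }
        }
        return result;
    }
}
```

With this shape, adding a comment is a single insert, which addresses the extra loads described in the question.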

For more input about aggregates and repositories, you might find "Spring Data JDBC, References, and Aggregates" interesting. It is about Spring Data JDBC, not Spring Data JPA, but the conceptual ideas do apply.

CodePudding user response：

N+1 problem: data is fetched in a loop. If you have 2000 posts, each with its own comments, you need to avoid issuing a separate fetch for every post.

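// Ex: one query loads the user together with their 2000 posts
User user = userRepository.findById(1L).orElseThrow();
for (Post post : user.getPosts()) {
    // Fetching in a loop: the lazy comments collection is loaded per post,
    // so using it costs one more database query for each of the 2000 posts.
    Set<Comment> comments = post.getComments();
}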

Solution: create a repository for Post and fetch through a custom query. There are several ways to fetch eagerly, e.g. @EntityGraph, FetchType.EAGER, or a JPQL fetch join.

// "left join fetch" loads the post and its comments in a single query
@Query("select p from Post p left join fetch p.comments where p.id = :postId")
Set<Post> postsWithComments(@Param("postId") Long postId);

Set<Post> posts = postRepository.postsWithComments(1L);
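The difference between fetching in a loop and fetching in one go can be made visible by counting round trips. The following is a plain Java simulation, not JPA: CountingCommentStore is a hypothetical in-memory stand-in for the database, and queryCount only counts method calls that would each be one query.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a comments table that counts
// how many "queries" (round trips) were issued against it.
class CountingCommentStore {
    private final Map<Long, List<String>> commentsByPostId = new HashMap<>();
    int queryCount = 0;

    void save(long postId, String text) {
        commentsByPostId.computeIfAbsent(postId, id -> new ArrayList<>()).add(text);
    }

    // One round trip per post: what touching a lazy collection amounts to.
    List<String> findByPostId(long postId) {
        queryCount++;
        return commentsByPostId.getOrDefault(postId, Collections.emptyList());
    }

    // One round trip for many posts: comparable to a fetch join or an IN query.
    Map<Long, List<String>> findByPostIds(Collection<Long> postIds) {
        queryCount++;
        Map<Long, List<String>> result = new HashMap<>();
        for (Long postId : postIds) {
            result.put(postId, commentsByPostId.getOrDefault(postId, Collections.emptyList()));
        }
        return result;
    }
}
```

Calling findByPostId once per post for 2000 posts raises queryCount to 2000, while a single findByPostIds call for the same posts raises it by 1.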

Even so, you need to be careful when fetching data eagerly. If a post has a lot of comments, simply use another repository for Comment.

// The id type is Long, matching the entity's id field
Set<Comment> findByPostId(Long postId);
Set<Comment> comments = commentRepository.findByPostId(1L);

And if there are, say, 60000 comments for a single post, you need to fetch them with pagination, which can be helpful under heavy load.

Page<Comment> findByPostId(Long postId, Pageable pageable);

// PageRequest.of takes (page, size); page indexes are zero-based.
Pageable pageable = PageRequest.of(0, 2000);
Page<Comment> comments;
do {
    comments = commentRepository.findByPostId(1L, pageable);
    // do something with comments.getContent()
    pageable = comments.nextPageable();
} while (comments.hasNext());
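To make the paging arithmetic concrete, here is a sketch that runs without a database. PagingSketch, pageCount, and page are hypothetical helper names, and the list slice only mimics what a paged query would return.

```java
import java.util.ArrayList;
import java.util.List;

class PagingSketch {

    // Ceiling division: how many pages are needed for totalElements items.
    static int pageCount(long totalElements, int pageSize) {
        return (int) ((totalElements + pageSize - 1) / pageSize);
    }

    // Returns the zero-based page of the list, or an empty list past the end.
    static <T> List<T> page(List<T> all, int pageIndex, int pageSize) {
        long start = (long) pageIndex * pageSize;
        int from = (int) Math.min(start, all.size());
        int to = Math.min(from + pageSize, all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```

With 60000 comments and a page size of 2000, pageCount returns 30, so pages 0 through 29 cover every comment exactly once.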

Beyond that, consider caching strategies to improve performance.

You also need to define the target response time for a request and measure the actual response time. You can fetch with a left join or simply issue another request. For long-running processes you can use asynchronous operations as well.