I have an application that uses Hibernate and it's running out of memory with a medium volume dataset (~3 million records). When analysing the memory dump using Eclipse's Memory Analyser I can see that StatefulPersistenceContext appears to be holding a copy of the record in memory in addition to the object itself, doubling the memory usage.
I'm able to reproduce this on a slightly smaller scale with a defined workflow, but am unable to simplify it to the level that I can put the full application here. The workflow is:
- Insert ~400,000 records (Fruit) into the database from a file
- Get all of the Fruits from the database and find if there are any complementary items to create ~150,000 Baskets (containing two Fruits)
- Retrieve all of the data (Fruits & Baskets) and save to a file
It's running out of memory at the final stage, and the heap dump shows StatefulPersistenceContext has hundreds of thousands of Fruits in memory, in addition to the Fruits we retrieved to save to the file.
I've looked around online and the suggestion appears to be to use QueryHints.READ_ONLY on the query (I put it on the getAll), or to wrap it in a Transaction with the readOnly property set, but neither of these seems to have stopped the massive StatefulPersistenceContext.
Is there something else I should be looking at?
Examples of the classes / queries I'm using:
public interface ShoppingService {
    public void createBaskets();
    public void loadFromFile(ObjectInput input);
    public void saveToFile(ObjectOutput output);
}

@Service
public class ShoppingServiceImpl implements ShoppingService {
    @Autowired
    private FruitDAO fDAO;
    @Autowired
    private BasketDAO bDAO;

    @Override
    public void createBaskets() {
        bDAO.add(Basket.generate(fDAO.getAll()));
    }

    @Override
    public void loadFromFile(ObjectInput input) {
        SavedState state = ((SavedState) input.readObject());
        fDAO.add(state.getFruits());
        bDAO.add(state.getBaskets());
    }

    @Override
    public void saveToFile(ObjectOutput output) {
        output.writeObject(new SavedState(fDAO.getAll(), bDAO.getAll()));
    }

    public static void main(String[] args) throws Throwable {
        ShoppingService service = null;
        try (ObjectInput input = new ObjectInputStream(new FileInputStream("path\\to\\input\\file"))) {
            service.loadFromFile(input);
        }
        service.createBaskets();
        try (ObjectOutput output = new ObjectOutputStream(new FileOutputStream("path\\to\\output\\file"))) {
            service.saveToFile(output);
        }
    }
}

@Entity
public class Fruit {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;
    private String name;
    // ~ 200 string fields
}

public interface FruitDAO {
    public void add(Collection<Fruit> elements);
    public List<Fruit> getAll();
}

@Repository
public class JPAFruitDAO implements FruitDAO {
    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional()
    public void add(Collection<Fruit> elements) {
        elements.forEach(em::persist);
    }

    @Override
    public List<Fruit> getAll() {
        return em.createQuery("FROM Fruit", Fruit.class).getResultList();
    }
}

@Entity
public class Basket {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;
    @OneToOne
    @JoinColumn(name = "arow")
    private Fruit aRow;
    @OneToOne
    @JoinColumn(name = "brow")
    private Fruit bRow;

    public static Collection<Basket> generate(List<Fruit> fruits) {
        // Some complicated business logic that does things
        return null;
    }
}

public interface BasketDAO {
    public void add(Collection<Basket> elements);
    public List<Basket> getAll();
}

@Repository
public class JPABasketDAO implements BasketDAO {
    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional()
    public void add(Collection<Basket> elements) {
        elements.forEach(em::persist);
    }

    @Override
    public List<Basket> getAll() {
        return em.createQuery("FROM Basket", Basket.class).getResultList();
    }
}

public class SavedState {
    private Collection<Fruit> fruits;
    private Collection<Basket> baskets;
}
CodePudding user response:
Have a look at the answer to the question "How does Hibernate detect dirty state of an entity object?"
Without access to the heap dump or your complete code, I believe you are seeing exactly what you describe. As long as Hibernate considers it possible that the entities will change, it keeps a complete copy of each entity's loaded state in memory so that it can compare the object's current state with the state originally loaded from the database. At the end of the transaction (the transactional block of code), it automatically writes any changes to the database. It needs the original state for this comparison so that it can skip unchanged entities and avoid a large number of (potentially expensive) write operations.
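To make the snapshot idea concrete, here is a toy model in plain Java. It is not Hibernate's implementation: `DirtyCheckSketch`, `ToyContext`, the two-field `Fruit` and the `readOnly` flag are all hypothetical names invented for illustration. It only shows why tracking an entity for changes costs roughly a second copy of its state, and why a read-only entity would not need that copy.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.Map;

// Toy model of snapshot-based dirty checking. NOT Hibernate code:
// every class and method name here is hypothetical.
class DirtyCheckSketch {

    // Simplified stand-in for the entity (the real one has ~200 fields).
    static class Fruit {
        String name;
        String colour;

        Fruit(String name, String colour) {
            this.name = name;
            this.colour = colour;
        }

        // Current field values as an array, one slot per field.
        Object[] state() {
            return new Object[] { name, colour };
        }
    }

    static class ToyContext {
        // entity -> copy of its field values at load time (null if read-only)
        private final Map<Fruit, Object[]> snapshots = new IdentityHashMap<>();

        Fruit load(String name, String colour, boolean readOnly) {
            Fruit fruit = new Fruit(name, colour);
            // A writable entity gets a snapshot: this is the "second copy".
            snapshots.put(fruit, readOnly ? null : fruit.state());
            return fruit;
        }

        // Number of entities for which a snapshot is being held.
        int snapshotCount() {
            int count = 0;
            for (Object[] snapshot : snapshots.values()) {
                if (snapshot != null) count++;
            }
            return count;
        }

        // Compare current state with the snapshot, as a flush would.
        int countDirty() {
            int dirty = 0;
            for (Map.Entry<Fruit, Object[]> entry : snapshots.entrySet()) {
                Object[] snapshot = entry.getValue();
                if (snapshot != null && !Arrays.equals(snapshot, entry.getKey().state())) {
                    dirty++;
                }
            }
            return dirty;
        }
    }

    public static void main(String[] args) {
        ToyContext ctx = new ToyContext();
        Fruit apple = ctx.load("apple", "red", false);
        ctx.load("banana", "yellow", false);
        Fruit cherry = ctx.load("cherry", "red", true); // read-only: no snapshot

        System.out.println("snapshots held: " + ctx.snapshotCount()); // prints 2

        apple.colour = "green";  // detected, because a snapshot exists
        cherry.colour = "black"; // ignored, because no snapshot was kept

        System.out.println("dirty entities: " + ctx.countDirty()); // prints 1
    }
}
```

In this sketch the read-only entity is still referenced by the context; only its snapshot is skipped. That matches the "about half the memory" expectation rather than "no memory at all".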
I believe that making the transactional block read-only is a step in the right direction. I'm not completely sure, but I hope this information at least helps you understand why you are seeing such large memory consumption.
CodePudding user response:
1. Fetching all Fruits from the database at once, or persisting a large set of Baskets at once, hurts both database and application performance because of the large number of objects held in heap memory (in the young or old generation, depending on how long each object survives). Process the data in batches instead of all at once: use Spring Batch, or implement custom logic that processes the data in chunks.
2. The persistence context stores newly created and modified entities in memory. Hibernate sends these changes to the database when the persistence context is synchronized, which generally happens at the end of a transaction. Calling EntityManager.flush() also triggers this synchronization. The persistence context also serves as an entity cache, referred to as the first-level cache. To remove the entities held in the persistence context, call EntityManager.clear().
You can take a reference for batch processing from here.
3. If you don't plan on modifying Fruit, you could just fetch the entries in read-only mode: Hibernate will not retain the dehydrated state which it normally uses for the dirty checking mechanism. So you get roughly half the memory footprint.
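As a rough illustration of points 1 and 2, the chunking logic can be written independently of JPA. The helper below is a hypothetical sketch, not part of Spring or Hibernate: `ChunkedWriter`, `persistInChunks` and its parameters are invented names, and the persistence calls are passed in as callbacks so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: persists items one by one and runs a
// "flush and clear" callback after every chunkSize items.
class ChunkedWriter {

    // Returns how many times flushAndClear was invoked.
    static <T> int persistInChunks(Iterable<T> items, int chunkSize,
                                   Consumer<T> persist, Runnable flushAndClear) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int pending = 0;
        int flushes = 0;
        for (T item : items) {
            persist.accept(item);
            if (++pending == chunkSize) {
                flushAndClear.run(); // release what the context is holding
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {           // final, partially filled chunk
            flushAndClear.run();
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            items.add(i);
        }
        List<Integer> persisted = new ArrayList<>();
        int flushes = persistInChunks(items, 4, persisted::add, () -> { });
        // Chunks of 4, 4 and 2 items.
        System.out.println(persisted.size() + " items, " + flushes + " flushes"); // prints "10 items, 3 flushes"
    }
}
```

Inside a transactional DAO method, the callbacks would presumably be `em::persist` and `() -> { em.flush(); em.clear(); }`. Note that `clear()` detaches every managed entity, so any code that relies on those instances still being managed afterwards (for example Baskets pointing at previously loaded Fruits) needs checking before this pattern is adopted.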