I am attempting to insert ~57,000 entities into my database, but the insert method takes longer and longer as the loop progresses. I have implemented batches of 25, each time flushing, clearing, and (I'm pretty sure) closing the transaction, without success. Is there something else I need to do in the code below to maintain the insert rate? I feel like it should not take 4 hours to insert 57K records.
[Migrate.java]
This is the main class that loops through 'Xaction' entities and adds 'XactionParticipant' records based off each Xaction.
// Use hibernate cursor to efficiently loop through all xaction entities
String hql = "select xaction from Xaction xaction";
Query<Xaction> query = session.createQuery(hql, Xaction.class);
query.setFetchSize(100);
query.setReadOnly(true);
query.setLockMode("xaction", LockMode.NONE);
ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
int count = 0;
Instant lap = Instant.now();
List<Xaction> xactionsBatch = new ArrayList<>();
while (results.next()) {
    count++;
    Xaction xaction = (Xaction) results.get(0);
    xactionsBatch.add(xaction);
    // save new XactionParticipants in batches of 25
    if (count % 25 == 0) {
        xactionParticipantService.commitBatch(xactionsBatch);
        float rate = ChronoUnit.MILLIS.between(lap, Instant.now()) / 25f / 1000;
        System.out.printf("Batch rate: %.4fs per xaction\n", rate);
        xactionsBatch = new ArrayList<>();
        lap = Instant.now();
    }
}
xactionParticipantService.commitBatch(xactionsBatch);
results.close();
[XactionParticipantService.java]
This service provides a method annotated with "REQUIRES_NEW" in an attempt to commit and close the transaction for each batch.
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void commitBatch(List<Xaction> xactionBatch) {
    for (Xaction xaction : xactionBatch) {
        try {
            XactionParticipant xp = new XactionParticipant();
            // ... create xp based off Xaction info ...

            // Use native query for efficiency
            String nativeQueryStr = "INSERT INTO XactionParticipant .... xp info/data";
            Query q = em.createNativeQuery(nativeQueryStr);
            q.executeUpdate();
        } catch (Exception e) {
            log.error("Unable to update", e);
        }
    }
    // Clear just in case??
    em.flush();
    em.clear();
}
CodePudding user response:
It is not clear what the root cause of your performance problem is: Java memory consumption or DB performance. Please check the thoughts below:
- The following code does not actually optimize memory consumption:
String hql = "select xaction from Xaction xaction";
Query<Xaction> query = session.createQuery(hql, Xaction.class);
query.setFetchSize(100);
query.setReadOnly(true);
query.setLockMode("xaction", LockMode.NONE);
ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
Since you are retrieving fully functional entities, those entities get stored in the persistence context (the session-level cache). To free memory, you need to detach each entity once it has been processed (i.e. after xactionsBatch.add(xaction) or after // ... create xp based off Xaction info ...); otherwise, by the end of processing you will have consumed the same amount of memory as if you had done List<Xaction> results = query.getResultList();. And here I'm not sure which is better: consuming all the required memory at the start of the transaction and releasing all other resources, or keeping the cursor and JDBC connection open for 4 hours.
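For illustration, a minimal sketch of what evicting the processed entities could look like in the loop from the question (assuming session is the same Hibernate Session that created the scrollable query; session.evict() and em.detach() are equivalent here):

while (results.next()) {
    count++;
    Xaction xaction = (Xaction) results.get(0);
    xactionsBatch.add(xaction);
    if (count % 25 == 0) {
        xactionParticipantService.commitBatch(xactionsBatch);
        // evict the entities that have just been processed so the
        // persistence context does not keep growing with every batch
        for (Xaction processed : xactionsBatch) {
            session.evict(processed);
        }
        xactionsBatch = new ArrayList<>();
    }
}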
- The following code does not actually optimize JDBC interactions:
for (Xaction xaction : xactionBatch) {
    try {
        XactionParticipant xp = new XactionParticipant();
        // ... create xp based off Xaction info ...
        // Use native query for efficiency
        String nativeQueryStr = "INSERT INTO XactionParticipant .... xp info/data";
        Query q = em.createNativeQuery(nativeQueryStr);
        q.executeUpdate();
    } catch (Exception e) {
        log.error("Unable to update", e);
    }
}
Yes, in general, plain JDBC should be faster than the JPA API; however, that is not your case - you are inserting records one by one instead of using batch inserts. To take advantage of JDBC batching, your code should look something like this:
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void commitBatch(List<Xaction> xactionBatch) {
    // session can be obtained from the injected EntityManager: em.unwrap(Session.class)
    session.doWork(connection -> {
        String insert = "INSERT INTO XactionParticipant VALUES (?, ?, ...)";
        try (PreparedStatement ps = connection.prepareStatement(insert)) {
            for (Xaction xaction : xactionBatch) {
                // ... bind values derived from xaction ...
                ps.setString(1, "val1");
                ps.setString(2, "val2");
                // queue the row instead of executing it immediately
                ps.addBatch();
            }
            // send all queued rows to the DB as one JDBC batch
            ps.executeBatch();
        }
    });
}
BTW, Hibernate can do the same if hibernate.jdbc.batch_size
is set to a large enough positive integer and the entities are properly designed (ID generation is backed by a DB sequence and allocationSize is large enough).
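For illustration, a minimal sketch of that setup (assuming Spring Boot style configuration; the generator name, sequence name, and allocation size below are placeholders, not values from the question):

// application.properties (assuming Spring Boot):
//   spring.jpa.properties.hibernate.jdbc.batch_size=50
//   spring.jpa.properties.hibernate.order_inserts=true

@Entity
public class XactionParticipant {

    // A DB-sequence-backed ID with a large allocationSize lets Hibernate assign IDs
    // in memory and group the INSERTs into JDBC batches on flush.
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "xp_seq")
    @SequenceGenerator(name = "xp_seq", sequenceName = "xaction_participant_seq", allocationSize = 50)
    private Long id;

    // ... other fields ...
}

With that in place, the service could simply call em.persist(xp) in the loop and Hibernate would group the inserts into JDBC batches when the persistence context is flushed.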