It is a bit unclear to me how and if to use an in-memory cache.
I am developing a Java application (Minecraft Spigot plugin) that creates a medieval MMO-type game, where there are a lot of objects (Town, Structures and items with specific properties etc.).
Many of these objects are read and modified a lot, which made me look into a caching solution to prevent these objects from begin fetched from a MySQL database (which might not be the best solution for MMO's, but I'd like to stick with it for now).
So for my specific situation I'd assume the in-memory caching approach seems the right option to pick. I've been searching and digging through lots of information about caching and cache providers, but it's still unclear to me if in-memory caching requires to retrieve and cache an object each time it is being modified.
This is an important part for me to understand, since some objects are continuously modified. Let's take a town gate for example. The Gate object has a health property, which changes when a player damages the gate. In my understanding, fetching the gate object, altering the health property and caching the modified object each time a player damages the gate seems a little cumbersome. Especially given the fact that most gate objects have a default health of 500, and the most damage-output from a player can be 20. This would mean that in the case of a player wishing to destroy the gate object, the object has to be fetched, modified and cached back 25 times in a matter of minutes.
Would it at all be possible to just fetch the object from the cache (given the object has been retrieved from the DB and cached before) modify it and leave it like that? This would of course mean that the cache would hold a reference to the object and when modified the cache would refer to the modified instance. If this is possible, is it a good approach performance-wise?
To summarize my questions:
- Would an in-memory (key-value) cache be logical given my situation (lots of objects being modified frequently)?
- Would it be possible and efficient (performance-wise) to hold references to cached objects in a cache?
Thanks in advance for your help!
CodePudding user response:
Yes, caching is needed. Making SQL request is too long. Such as it's saved in RAM, it's clearly faster.
How can you save them ?
- Simply use Map.
You have to choose the type of map according to the type of your data, specially the key.
You can get more informations about it here or here.
Use cache manager such as explain here with MapMaker (from Guava)
You can make your own cache manager. Specially cleaning sometimes, like that:
public HashMap<UUID, Data> datas = new HashMap<>(); // here you have to choose the good map type
public void onEnable() {
getServer().getScheduler().runTaskTimerAsynchronously(this, () -> {
synchronized(datas) { // prevent used from others thread
datas.values().forEach(Data::save); // save all data
datas.clear();
}
}, 20 * 60, 20 * 60); // in ticks, 20 ticks = 1s so 20*60 = 1min
}
Here, each minute you save all object and clear them. Clearing is to prevent too many data.
I suggest you to combine multiple things : Use cacher, and save when it's not used for long time.
CodePudding user response:
Interesting question, since it touches many aspects of performance engineering, concurrent programming and especially consistency via performance trade offs.
It is possible to enable caching in Hibernate and configure write behind caching this is, for example, possible with EHCache. However, there is still considerable overhead, since Hibernate and the database is designed for transactional workloads and consistency.
I am going to present a potentially optimal solution based on cache2k. I do leverage some feature in cache2k that are not present in other caches, however, some of the techniques I use here can be used with other cache implementations as well like EHCache, Guava Cache, Caffeine or a JCache/JSR107 that supports store by reference.
Since you run on a single server, working with in memory data is most efficient. Writes can be skipped or delayed, since you do not have transactional data like bank accounts. In case of a server crash it is tolerable to loose a tiny bit of status updates. It always is a trade off between update performance and potential data loss in case of a crash.
You can hold the current status in a map or cache and then update the existing object. Example:
class Gate {
final AtomicInteger health = new AtomicInteger(100);
}
Cache<Id, Gate> cache = ....;
void decreaseHealth(Id id, int damage) {
Gate gate = cache.get(id);
gate.health.addAndGet(-damage);
}
I present additional code that interacts with the database later and focus on the in memory update first.
If you use a map instead of a cache, you need to use a map that is thread safe, like ConcurrentHashMap
.
Since the changes may happen concurrently you need to use a method to atomically update the health. Above I used AtomicInteger
. Another possibility is the atomic updater or var handles. If you just update a single value, this is approach is most efficient, since it translates to a single CAS operation on the hardware. If you update multiple values in the object, use locks, synchronized
or an atomic operation on the cache/map entry. Example:
class Gate {
int health = 100;
// ....
}
cache.asMap().compute(id, (unused, gate) -> {
gate.health -= damage;
if (gate.health == 0) {
// more changes to the object if destroyed totally
}
return gate;
});
Here is an idea of a working solution based of cache2k that does schedule write behind in case a major change happened.
// mocks
class Id {}
Gate readFromDb() { return null; }
void writeDb(Gate g) { }
class Gate {
final AtomicInteger health = new AtomicInteger(100);
volatile boolean writeScheduled = false;
final AtomicInteger persistentHealth = new AtomicInteger(100);
boolean isDirty() {
return persistentHealth.get() != health.get();
}
}
Cache<Id, Gate> cache =
new Cache2kBuilder<Id, Gate>() {}
.loader((id, l, cacheEntry) -> {
if (cacheEntry == null) { return readFromDb(); }
Gate gate = cacheEntry.getValue();
return gate;
})
.addListener((CacheEntryExpiredListener<Id, Gate>) (cache, cacheEntry) -> {
writeIfModified(cacheEntry.getValue());
})
.refreshAhead(true)
.keepDataAfterExpired(true)
.expireAfterWrite(5, TimeUnit.MINUTES)
.loaderExecutor(Executors.newFixedThreadPool(30))
.build();
void writeIfModified(Gate gate) {
if (!gate.isDirty()) { return; }
int persistentHealth = gate.health.get();
writeDb(gate);
gate.writeScheduled = false;
gate.persistentHealth.set(persistentHealth);
}
long writeBehindDelayMillis = 500;
int changePercentage = 10;
public void decreaseHealth(Id id, int damage) {
Gate gate = cache.get(id);
int persistentHealth = gate.persistentHealth.get();
int newHealth = gate.health.addAndGet(-damage);
if (!gate.writeScheduled) {
int percentage = persistentHealth * 10 / 100;
if (newHealth > persistentHealth percentage ||
newHealth < persistentHealth - percentage) {
cache.invoke(id, entry -> {
entry.setValue(entry.getValue());
entry.setExpiryTime(entry.getStartTime() writeBehindDelayMillis);
entry.getValue().writeScheduled = true;
return null;
});
}
}
}
This operates the cache in read through mode, so a cache.get()
triggers the initial database load. Further more, it uses expiry and refresh ahead to schedule a delayed write, if needed. That is a bit tricky, since the usual and "documented" use case of refresh ahead is a different one. If interested I can explain the detailed mechanics in a blog post. It gets a bit too heavy for a Stack Overflow answer.
Of course, you can use some of the ideas with other cache implementations as well.
One final note: In case map compute is used for atomicy, operations may block if a concurrent synchronous database write is going on. Either do asynchronous writes or use different locking for updates.
It still would be interesting to do a performance comparison to the write back approach via a JPA cache.