I am facing the following problem and would like to hear others' opinions and suggestions.
I have a third-party API that only accepts a single storeId per request, with the following query params. Let's say I have 10,000 storeIds; fetching all of them would mean 10k calls:
?storeId=1&status=active
I have my own service which calls this API once per storeId, for the N storeIds requested through my own endpoint. Let's say, for example, I call my service for 100 stores:
http://localhost:8080?storeIds=[1,2,3,4,5...] up to 100
This call could also be a request for a filtered set of stores:
http://localhost:8080?storeIds=[78,99,104,320,123...] up to N
// Current approach: one third-party call per requested storeId
List<Store> stores = new ArrayList<>(storeIds.size());
storeIds.forEach(storeId -> {
    Store store = thirdPartyService.call(storeId);
    stores.add(store);
});
Making the calls in parallel would cause the third-party system to collapse, since it cannot support that many requests per second.
I also cannot change the third-party endpoint to accept N storeIds at once.
To work around this I implemented a cache. However, since I store each store in the cache one by one, it still causes performance issues, because I may have to hit the cache 10,000 times.
Since the list of storeIds I receive on my endpoint is not always the same, it can contain some storeIds that are already in the cache and some that are not, so I cannot retrieve everything from the cache and still have to call the third-party API for the missing ones.
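For reference, this is roughly what the per-id cache handling looks like today (a simplified sketch; the ConcurrentHashMap just stands in for my real cache, and thirdPartyService.call is the per-id HTTP call from above):

Map<Integer, Store> storeCache = new ConcurrentHashMap<>(); // stands in for the real cache, 1 storeId = 1 entry

List<Store> stores = new ArrayList<>(storeIds.size());
for (Integer storeId : storeIds) {
    // one cache lookup per id; a miss still means one third-party call for that id
    Store store = storeCache.computeIfAbsent(storeId, id -> thirdPartyService.call(id));
    stores.add(store);
}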
I would like to hear other points of view on this, because caching a whole Set of storeIds as a single entry would not solve the issue either. For example:
First call: storeIds = [1,2,3,4,5,732,2321] (there are many more, but I am simplifying); this Set gets stored in the cache.
Second call: storeIds = [1,2,3,99,102,232,732] (again, many more).
In the second call some of the elements are in the cache, others are not, and some cached entries are not even requested. The previously stored Set contains the data for some of them, but also data I do not need. That is why I stored the data as 1 storeId = 1 entry in the cache.
Thank you so much!
CodePudding user response:
A lot of cache implementations offer a so-called bulk get method: Cache.getAll. That does exactly what you want. Here is a solution sketch based on cache2k. The cache setup is:
Cache<Integer, Store> cache =
    new Cache2kBuilder<Integer, Store>() {}
        .loader(id -> {
            // HTTP call to get the Store data for the given id
            return new Store();
        })
        .loaderThreadCount(4) // number of parallel loads
        .build();
Then you can retrieve data like so:
// request some data, everything will be retrieved via the loader
Map<Integer, Store> result = cache.getAll(Arrays.asList(1, 2, 3, 4, 5, 6, 7));
// ids 4 and 5 will be taken from the cache, the others are loaded
Map<Integer, Store> result2 = cache.getAll(Arrays.asList(4, 5, 10, 12, 17));
This will issue a maximum of 4 parallel load requests to your data source. Asynchronous variants are available as well.
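If you prefer a non-blocking variant, a sketch along these lines is possible; this assumes cache2k 2.x, where the bulk load methods return a CompletableFuture (the exact signature differs in older versions, so check the version you use):

// trigger background loading of the missing ids, then read the now-cached entries
List<Integer> ids = Arrays.asList(4, 5, 10, 12, 17);
CompletableFuture<Map<Integer, Store>> result3 =
    cache.loadAll(ids).thenApply(done -> cache.peekAll(ids));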
The solution looks similar with other caches such as Caffeine, Ehcache, or any JCache-compatible cache.
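For example, a roughly equivalent setup with Caffeine could look like this (a sketch; the sizing and expiry values are placeholders, and thirdPartyService.call is the per-id HTTP call from the question):

LoadingCache<Integer, Store> storeCache = Caffeine.newBuilder()
    .maximumSize(10_000)                       // placeholder sizing, tune as needed
    .expireAfterWrite(Duration.ofMinutes(10))  // placeholder freshness window
    .build(storeId -> thirdPartyService.call(storeId));

// cached ids are returned directly; the loader is only invoked for the missing ones
Map<Integer, Store> stores = storeCache.getAll(storeIds);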