I have a method that reads hashes from Redis:
private Task FetchHashesFromRedis(List<string> redisKeys, ConcurrentBag<LiveDataModel> liveDataModels,
CancellationToken cancellationToken)
{
var parallelism = Environment.ProcessorCount;
var semafore = new SemaphoreSlim(initialCount: parallelism, maxCount: parallelism);
var tasks = new List<Task>();
Parallel.ForEach(redisKeys, (key) =>
{
tasks.Add(ExecuteOne(key, semafore, liveDataModels, cancellationToken));
});
return Task.WhenAll(tasks);
}
redisKeys
list count is always 1000, so it will always make a thousand of requests.
FetchHashesFromRedis
method is always the same
ExecuteOne
method in the first case looks like this:
private async Task ExecuteOne(string redisKey, SemaphoreSlim semafore, ConcurrentBag<LiveDataModel> liveDataModels,
CancellationToken cancellationToken)
{
var liveData = await _getLiveDataFromRedis.ExecuteAsync(redisKey, cancellationToken);
if (liveData != null)
{
liveDataModels.Add(liveData);
}
}
In this first case making 1000 requests to Redis takes 1,5 seconds with all side work that I do with models that I'm getting.
ExecuteOne
method in the second case (with Semaphore) looks like this:
private async Task ExecuteOne(string redisKey, SemaphoreSlim semafore, ConcurrentBag<LiveDataModel> liveDataModels,
CancellationToken cancellationToken)
{
await semafore.WaitAsync(cancellationToken);
try
{
var liveData = await _getLiveDataFromRedis.ExecuteAsync(redisKey, cancellationToken);
if (liveData != null)
{
liveDataModels.Add(liveData);
}
}
finally
{
semafore.Release();
}
}
In this second case making 1000 requests to Redis takes 4,5 seconds with all side work that I do with models that I'm getting. (the same amount of request as in the first case)
So the only difference between first and second case that in second case I use this:
await semafore.WaitAsync(cancellationToken);
and in finally
block I use:
semafore.Release();
Why when I use semafore
it takes more time (up to 3 times more)? Should I use semafore in this case or no? And when should I use semafore?
NOTE: _getLiveDataFromRedis.ExecuteAsync(redisKey, cancellationToken);
method is not thread-safe, it just reading different values from redis and returns LiveDataModel
CodePudding user response:
Why when I use semafore it takes more time (up to 3 times more)?
Probably since it limits the number of concurrent IO operations. Using a semaphore will limit the number of concurrent calls to the number of processors, however, since it involves IO, most of the time will just be waiting, no processor time needed. So limiting concurrency to the number of cores make little sense. Try increasing the maxCount and see if that helps performance.
Should I use semafore in this case or no?
Since it is slower, and does not seem to needed for any threadsafety reason, the answer is probably No.
And when should I use semafore?
I rarely use semaphores. The most compelling reason I know would be if I need a async lock, i.e. a semaphore with maxcount 1. There are specialized use cases for it, but in most cases I find higher level primitives that take care of synchronization more useful.
I might suggest reading up on DataFlow, that might allow you to setup a asynchronous processing pipeline that fit better for your use-case.