Parallel HttpClient requests timing out due to async problem?

Time:10-01

I'm running a method synchronously in parallel using System.Threading.Tasks.Parallel.ForEach. At the end of the method, it needs to make a few dozen HTTP POST requests, which do not depend on each other. Since I'm on .NET Framework 4.6.2, System.Net.Http.HttpClient is exclusively asynchronous, so I'm using Nito.AsyncEx.AsyncContext to avoid deadlocks, in the form:

public static void MakeMultipleRequests(IEnumerable<MyClass> enumerable)
{
    AsyncContext.Run(async () => await Task.WhenAll(enumerable.Select(async c => 
        await getResultsFor(c).ConfigureAwait(false))));
}

The getResultsFor(MyClass c) method then creates an HttpRequestMessage and sends it using:

await httpClient.SendAsync(request);

The response is then parsed and the relevant fields are set on the instance of MyClass.

My understanding is that the synchronous thread will block at AsyncContext.Run(...), while a number of tasks are performed asynchronously by the single AsyncContextThread owned by AsyncContext. When they are all complete, the synchronous thread will unblock.

This works fine for a few hundred requests, but when it scales up to a few thousand over five minutes, some of the requests start returning HTTP 408 Request Timeout errors from the server. My logs indicate that these timeouts are happening at the peak load, when there are the most requests being sent, and the timeouts happen long after many of the other requests have been received back.

I think the problem is that the tasks are awaiting the server handshake inside HttpClient, but they are not continued in FIFO order, so by the time they are continued the handshake has expired. However, I can't think of any way to deal with this, short of using a System.Threading.SemaphoreSlim to enforce that only one task can await httpClient.SendAsync(...) at a time.

My application is very large, and converting it entirely to async is not viable.

CodePudding user response:

This can't be fixed by changing how the tasks are wrapped before blocking. Right now you're nuking the client: .NET Framework allows only 2 concurrent connections per endpoint by default (ServicePointManager.DefaultConnectionLimit), so the remaining requests queue up on the client. That limit can be raised, but if you set it too high and all the requests go through at once, you may end up nuking the server instead.
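As a rough sketch of relaxing that limit for a single host rather than globally: ServicePointManager.FindServicePoint returns the ServicePoint for a URI, and its ConnectionLimit can be set on its own. The ConnectionLimits class, the Raise helper, and the example URL are hypothetical names for illustration, and this assumes .NET Framework, where HttpClient goes through ServicePointManager.

```csharp
using System;
using System.Net;

public static class ConnectionLimits
{
    // Raises the concurrent-connection limit for one endpoint only and
    // returns the limit now in effect on that ServicePoint.
    public static int Raise(Uri endpoint, int limit)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(endpoint);
        servicePoint.ConnectionLimit = limit;
        return servicePoint.ConnectionLimit;
    }
}
```

For example, ConnectionLimits.Raise(new Uri("https://api.example.com/"), 10) would be called once at startup, before the first request is sent. A modest value keeps the client from flooding the server.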

You can solve this by using TPL Dataflow blocks in a pipeline that executes the requests with a fixed degree of parallelism and then parses the responses. Let's say you have a class called MyPayload with lots of Items in a property:

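// Requires the System.Threading.Tasks.Dataflow, Newtonsoft.Json and FastMember packages.
// _client (an HttpClient), connectionString and destination are assumed to be defined elsewhere.

// Raise the default limit of 2 concurrent connections per endpoint.
ServicePointManager.DefaultConnectionLimit = 1000;

// Run at most 10 downloads at a time.
var options = new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 10
};

// Downloads each URL and deserializes the JSON body.
var downloader = new TransformBlock<string, MyPayload>(async url =>
{
    var json = await _client.GetStringAsync(url);
    return JsonConvert.DeserializeObject<MyPayload>(json);
}, options);

// Bulk-inserts the items of each payload, one payload at a time (the default).
var importer = new ActionBlock<MyPayload>(async data =>
{
    var items = data.Items;

    using (var connection = new SqlConnection(connectionString))
    using (var bcp = new SqlBulkCopy(connection))
    using (var reader = ObjectReader.Create(items))
    {
        bcp.DestinationTableName = destination;
        await connection.OpenAsync();

        await bcp.WriteToServerAsync(reader);
    }
});

// Forward downloaded payloads to the importer, along with completion.
downloader.LinkTo(importer, new DataflowLinkOptions
{
    PropagateCompletion = true
});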

I'm using FastMember's ObjectReader to wrap the items in a DbDataReader that can be used to bulk insert the records to a database.

Once you have this pipeline, you can start posting URLs to the head block, downloader:

foreach(var url in hugeList)
{
    downloader.Post(url);
}
downloader.Complete();

Once all URLs are posted, you tell downloader to complete and then await the completion of the last block in the pipeline:

await importer.Completion;
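Since the question says the calling code has to stay synchronous, here is a minimal sketch of driving a throttled block from a blocking method. ThrottledRunner and RunAll are hypothetical names, and the sketch assumes the System.Threading.Tasks.Dataflow package and the block's default scheduler (the thread pool), which is what makes blocking on Completion workable here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

public static class ThrottledRunner
{
    // Runs `work` for every item with at most `maxParallel` operations in flight
    // and blocks the calling thread until all of them have finished.
    public static void RunAll<T>(IEnumerable<T> items, Func<T, Task> work, int maxParallel)
    {
        var block = new ActionBlock<T>(work, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxParallel
        });

        foreach (var item in items)
        {
            // The input queue is unbounded by default, so Post accepts every item.
            block.Post(item);
        }
        block.Complete();

        // The delegates run on the thread pool (the block's default scheduler),
        // so blocking here does not stall their continuations.
        // If a delegate throws, the block faults and the exception surfaces here.
        block.Completion.GetAwaiter().GetResult();
    }
}
```

With the method from the question, the call would look something like ThrottledRunner.RunAll(enumerable, c => getResultsFor(c), 10). The degree of parallelism is a guess to tune against what the server can take.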

CodePudding user response:

There are two possible causes:

  • a bug in System.Net.Http.HttpClient in .NET Framework 4.6.2
  • the continuation priority issue outlined in the question, in which individual requests are not continued promptly enough and so time out.

As described in this answer and its comments on a similar question, it may be possible to deal with the priority problem using a custom TaskScheduler, but throttling the number of concurrent requests with a semaphore is probably the best answer:

using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Nito.AsyncEx;

public class MyClass 
{
    private static readonly HttpClient httpClient = new HttpClient();
    private static readonly SemaphoreSlim semaphore = new SemaphoreSlim(10);

    public HttpRequestMessage Request { get; set; }
    public HttpResponseMessage Response { get; private set; }
        
    private async Task GetResponseAsync()
    {
        await semaphore.WaitAsync();
        try
        {
            Response = await httpClient.SendAsync(Request);
        }
        finally
        {
            semaphore.Release();
        }
    }
    
    public static void MakeMultipleRequests(IEnumerable<MyClass> enumerable)
    {
        AsyncContext.Run(async () =>
            await Task.WhenAll(enumerable.Select(async c =>
                await c.GetResponseAsync())));
    }
}