I created and manage a SOAP API built in ASP.NET ASMX. The API processes about 10,000 requests per day. Most days, about 3 requests sent by the client (we only have 1 client) do not reach the web server (IIS). There is no discernible pattern.
We are actually using 2 web servers that sit behind a load balancer. From the IIS logs, I am 100% confident that the requests are not reaching either web server.
The team that manages the network and the load balancer has not been able to 'confirm or deny' whether the problem is occurring at the load balancer. They suggested it's normal for requests to sometimes "get lost in the internet", and said that we should add retry logic to the API.
The requests are using TCP (and TLS). The client has confirmed that there is no problem occurring on their end.
My question is: is it normal for TCP requests to "get lost in the internet" at the frequency we are seeing (about 3 out of 10,000 per day)?
BTW, both the web server and the client are located in the same country. For what it's worth, the country in question is an Anglosphere country, so it's not the case that our internet infrastructure is shoddy.
CodePudding user response:
There is no such thing as a TCP request getting lost, since there is no such thing as a TCP request in the first place. There is a TCP connection, within it a TLS tunnel, and within that tunnel HTTP is spoken. Only at the HTTP level does the concept of request and response exist, and that is what becomes visible in the server logs.
Problems can occur in many places, like failing to establish the TCP connection in the first place due to no route (i.e. no internet) or too much packet loss. There can be random problems at the TLS level caused by bit flips which cause integrity errors and thus connection close. There can be problems at the HTTP level, for example when using HTTP keep-alive and the server closing an idle connection while at the same time the client is trying to send another request. And probably more places.
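To find out which of these layers a failure falls into, the client side could run each step separately and note where it stops. A minimal Python sketch, assuming a plain GET is enough to exercise the path (the real SOAP call would be a POST with a body); `probe`, its parameters and the stage names are made up for illustration:

```python
import socket
import ssl


def probe(host, port, path="/", timeout=5.0, context=None):
    """Return (stage, error); stage is "ok" or the first step that failed."""
    context = context or ssl.create_default_context()
    stage = "tcp_connect"
    try:
        # Step 1: name lookup plus TCP handshake.
        with socket.create_connection((host, port), timeout=timeout) as raw:
            stage = "tls_handshake"
            # Step 2: TLS handshake on top of the established TCP connection.
            with context.wrap_socket(raw, server_hostname=host) as tls:
                stage = "http_send"
                # Step 3: send one HTTP request through the TLS tunnel.
                request = (
                    f"GET {path} HTTP/1.1\r\n"
                    f"Host: {host}\r\n"
                    "Connection: close\r\n\r\n"
                )
                tls.sendall(request.encode("ascii"))
                stage = "http_read"
                # Step 4: read the start of the response.
                status_line = tls.recv(4096).split(b"\r\n", 1)[0]
                if not status_line.startswith(b"HTTP/"):
                    return stage, "no HTTP status line received"
                return "ok", None
    except OSError as exc:
        # ssl.SSLError, socket.gaierror and timeouts are all OSError subclasses.
        return stage, f"{type(exc).__name__}: {exc}"
```

Calling something like `probe("api.example.com", 443)` (hypothetical host) returns `("ok", None)` on success, or the name of the step that failed together with the error text. Only a failure at or after the HTTP send step has any chance of leaving a trace in the IIS logs.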
"The client has confirmed that there is no problem occurring on their end."
I have no idea what exactly this means. "No problem" would mean that the client sends the request and gets a response. But this is obviously not the case here, so the client is either failing to establish the TCP connection, failing at the TLS level, failing while sending the request, failing while reading the response, getting timeouts ... But maybe the client is simply ignoring some errors, so that no problem is visible at the client's end.
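One way to rule out silently ignored errors is to have the client record the outcome of every call, including the exception type, together with an ID and a timestamp that can later be matched against the IIS logs. A rough Python sketch of the idea (the client's actual stack is unknown; `call_and_record`, the `send` callable and the record fields are assumptions made for illustration):

```python
import logging
import time
import uuid

log = logging.getLogger("soap_client")


def call_and_record(send, payload):
    """Call send(payload) and return a record of what happened."""
    record = {
        "request_id": str(uuid.uuid4()),  # could also be sent as a header
        "started_at": time.time(),        # wall clock, to match server logs
        "outcome": "ok",
        "response": None,
        "error": None,
    }
    t0 = time.monotonic()
    try:
        record["response"] = send(payload)
    except Exception as exc:
        # Record the failure instead of swallowing it.
        record["outcome"] = "error"
        record["error"] = f"{type(exc).__name__}: {exc}"
        log.error("request %s failed: %s", record["request_id"], record["error"])
    record["elapsed_s"] = time.monotonic() - t0
    return record
```

If the client keeps such records, the handful of requests missing from the server logs each day should show up on the client side with a concrete error (timeout, connection reset, TLS failure) rather than as "no problem".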