My team migrated an ancient application from:
AWS Elastic Compute Cloud instance
Windows Server 2019
IIS
.NET Core 2.2
Public subnets with an Internet Gateway
to:
AWS Elastic Container Service (Fargate) task
Amazon Linux 2
Kestrel
.NET 6
Private subnets with no internet access at all
We've also tightened security extensively in security groups, IAM permissions, and other areas. So many changes were required that it is difficult to pinpoint where things went wrong.
After the migration, we started to encounter random 500s (returned by Kestrel directly), 502s (from the Application Load Balancer), and 504s (full timeouts, also from the Application Load Balancer) from our website. There's no obvious cause. All API calls seem to exhibit this behavior, seemingly at random.
Digging in deeper, we found errors like this:
---> System.Net.Sockets.SocketException (0xFFFDFFFE): Unknown socket error
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
at System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask.<>c.<.cctor>b__4_0(Object state)
--- End of stack trace from previous location ---
at System.Net.Sockets.TcpClient.CompleteConnectAsync(Task task)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at MySql.Data.Common.StreamCreator.GetTcpStream(MySqlConnectionStringBuilder settings, MyNetworkStream& networkStream)
at MySql.Data.MySqlClient.NativeDriver.Open()
at MySql.Data.MySqlClient.Driver.Open()
at MySql.Data.MySqlClient.Driver.Create(MySqlConnectionStringBuilder settings)
at MySql.Data.MySqlClient.MySqlPool.CreateNewPooledConnection()
at MySql.Data.MySqlClient.MySqlPool.GetPooledConnection()
at MySql.Data.MySqlClient.MySqlPool.TryToGetDriver()
at MySql.Data.MySqlClient.MySqlPool.GetConnection()
at MySql.Data.MySqlClient.MySqlConnection.Open()
at -my code happens here-
I suspect the problem goes beyond our MySQL RDS connection, though. The 502s and 504s suggest that Kestrel sometimes cannot respond to requests properly either, as if it is experiencing similar problems. However, it doesn't log any errors to that effect, at least with its default settings.
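One way to test whether the failures are specific to the MySQL driver or happen at the socket level is to open plain TCP connections to the database endpoint in a loop and count how many fail. The following is only a rough sketch under assumptions: `ConnectProbe`, `RunAsync`, and `Result` are hypothetical names made up for illustration, and the host, port, attempt count, and timeout are placeholders to be replaced with real values.

```csharp
using System;
using System.Diagnostics;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical diagnostic helper, not part of the original application.
public static class ConnectProbe
{
    // Summary of a probe run: total attempts, failed attempts,
    // and the slowest connect that still succeeded.
    public record Result(int Attempts, int Failures, TimeSpan SlowestSuccess);

    public static async Task<Result> RunAsync(
        string host, int port, int attempts, TimeSpan timeout)
    {
        int failures = 0;
        TimeSpan slowest = TimeSpan.Zero;

        for (int i = 0; i < attempts; i++)
        {
            var sw = Stopwatch.StartNew();
            try
            {
                // A fresh client per attempt, disposed right away,
                // so the probe itself does not hold sockets open.
                using var client = new TcpClient();
                using var cts = new CancellationTokenSource(timeout);
                await client.ConnectAsync(host, port, cts.Token);
                if (sw.Elapsed > slowest) slowest = sw.Elapsed;
            }
            catch (Exception ex) when (ex is SocketException or OperationCanceledException)
            {
                failures++;
                // SocketErrorCode is only available on SocketException;
                // a cancelled connect is reported as a timeout.
                var code = (ex as SocketException)?.SocketErrorCode.ToString() ?? "Timeout";
                Console.WriteLine($"attempt {i + 1}: {code} after {sw.ElapsedMilliseconds} ms");
            }
        }

        return new Result(attempts, failures, slowest);
    }
}
```

It could be called from a temporary endpoint or startup hook, for example `await ConnectProbe.RunAsync("mydb.example.internal", 3306, 100, TimeSpan.FromSeconds(5))`, where the host name is a placeholder. If plain connects fail at a similar rate to `MySqlConnection.Open()`, the problem most likely sits below the MySQL driver (networking, or resource exhaustion inside the task) rather than in the driver itself.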
CodePudding user response:
Finally, after a lot of debugging, we were able to find the real root cause. We were sometimes getting the following error when requests failed.
My teammate took this log and found a solution to the issue: removing a SafeFileHandle that was only being used for Windows Server.